Alex Merced's Data, Dev and AI Blog

Connect Oracle Database to Dremio Cloud: Enterprise Analytics Without Data Movement

2026-03-01 | 11 min read | Dremio, Connectors, Data Integration

Cover Image

Oracle Database runs the most critical enterprise applications in the world — ERP systems, financial ledgers, supply chain management, and HR platforms. These systems generate massive volumes of data that business teams want to analyze, but running analytical queries directly against Oracle is expensive (license costs scale with CPU usage), complex (Oracle-specific SQL dialects and tooling), and risky (heavy queries can impact transactional performance).

Dremio Cloud connects to Oracle Database and queries it in place using standard SQL. You don't need to license additional Oracle tools, build ETL pipelines, or export data to a separate warehouse. Dremio pushes filters and aggregations to Oracle, fetches only the results, and lets you join Oracle data with every other source in your organization in a single query.

This guide walks through the complete setup, including Oracle-specific features like native encryption, user impersonation, service name configuration, and the extensive predicate pushdown support.

Why Oracle Users Need Dremio

Oracle licensing costs make analytics expensive. Oracle licenses are typically tied to CPU cores. Running analytical workloads on your production Oracle instance consumes CPU, which means higher licensing costs. Dremio's Reflections create pre-computed copies of frequently queried Oracle data. After the initial query, subsequent analytics hit the Reflection — not Oracle — reducing CPU consumption and license exposure.

Cross-system analytics require ETL. Your financial data is in Oracle, your CRM data is in PostgreSQL, and your marketing data is in S3. Without a federation layer, joining these requires building ETL pipelines that extract data from each source, transform it, and load it into a central warehouse. That's months of engineering work. Dremio federates across all three sources with a single SQL query.

Oracle's analytical tooling is Oracle-specific. Oracle Analytics Cloud, Oracle BI, and Oracle Data Integrator work well within the Oracle ecosystem but don't extend to non-Oracle data. Dremio provides a vendor-neutral SQL layer that works with any BI tool (Tableau, Power BI, Looker) via Arrow Flight or ODBC, covering Oracle and every other connected source.

No semantic layer for AI. Oracle tables use technical names and lack the business context that AI agents need to generate accurate SQL. Dremio's semantic layer lets you create views with business logic, attach wiki descriptions, and enable the AI Agent to answer questions like "What's our quarterly revenue by product line?" by understanding what "quarterly revenue" means from your metadata.

Prerequisites

Before connecting Oracle to Dremio Cloud, confirm you have:

  • Oracle hostname or IP address
  • Port number — Oracle defaults to 1521
  • Service name — the Oracle service name (not the SID) for your database
  • Username and password — an Oracle user with SELECT privileges on the relevant schemas
  • Network access — port 1521 must be reachable from Dremio Cloud
  • Dremio Cloud accountsign up free for 30 days

Step-by-Step: Connect Oracle to Dremio Cloud

1. Add the Oracle Source

In the Dremio console, click the "+" in the left sidebar and select Oracle from the database source types.

2. Configure General Settings

  • Name: A descriptive identifier (e.g., erp-oracle or finance-oracle).
  • Host: The Oracle server hostname.
  • Port: Default 1521.
  • Service Name: The Oracle service name for your database.
  • Enable TLS encryption: Toggle this on for encrypted connections over TLS.
  • Oracle Native Encryption: If you don't use TLS, Oracle supports its own encryption protocol. Options are:
    • Accepted (default): Allows both encrypted and unencrypted connections.
    • Requested: Prefers encryption but accepts unencrypted if not available.
    • Required: Only encrypted connections allowed.
    • Rejected: No encryption.

You can use either TLS or Oracle Native Encryption, but not both on the same source.

3. Set Authentication

Three options:

  • Master Authentication: Username and password entered directly.
  • Secret Resource URL: Password stored in AWS Secrets Manager, referenced by ARN.
  • Kerberos: For environments where Oracle is configured with Kerberos authentication.

4. Configure Advanced Options

Oracle has several unique advanced settings:

Setting What It Does
Use timezone as connection region Uses the timezone to set the connection region
Include synonyms Makes Oracle synonyms visible as datasets in Dremio
Map Oracle DATE to TIMESTAMP Oracle's DATE type includes time components. Enable this to expose them as TIMESTAMP in Dremio instead of truncating to DATE
Record fetch size Rows per batch (default 200, set 0 for automatic)
Use LDAP Naming Services Authenticate via LDAP rather than Oracle's local user database
User Impersonation Run queries under each Dremio user's own Oracle credentials (see below)
Connection Properties Custom JDBC parameters

5. User Impersonation (Optional but Valuable)

Oracle supports user impersonation through proxy authentication. This means each Dremio user runs queries under their own Oracle username, with their own Oracle permissions, rather than sharing a single service account.

To set this up:

  1. Ensure each Dremio user has a matching username in Oracle.
  2. In Oracle, grant proxy authentication: ALTER USER analyst_user GRANT CONNECT THROUGH dremio_service_user;
  3. In Dremio's source settings, enable User Impersonation under Advanced Options.

This is particularly valuable in regulated industries where audit trails need to track which individual accessed which data.

6. Save the Connection

Configure Reflection Refresh, Metadata Refresh, and Privileges as needed, then click Save.

Query Oracle Data from Dremio

Browse your Oracle schemas and tables, then run standard SQL:

SELECT department_id, department_name, manager_id, location_id
FROM "erp-oracle".HR.DEPARTMENTS
WHERE location_id = 1700;

Dremio pushes the WHERE clause to Oracle and transfers only the matching rows.

Federate Oracle with Other Sources

Combine Oracle ERP data with S3 data and PostgreSQL data in one query:

SELECT
  d.department_name,
  COUNT(e.employee_id) AS headcount,
  AVG(e.salary) AS avg_salary,
  SUM(b.budget_amount) AS total_budget
FROM "erp-oracle".HR.DEPARTMENTS d
JOIN "erp-oracle".HR.EMPLOYEES e ON d.department_id = e.department_id
LEFT JOIN "finance-postgres".budgets.dept_budgets b ON d.department_id = b.dept_id
GROUP BY d.department_name
ORDER BY total_budget DESC;

Oracle handles the department-employee join (predicate pushdown), and Dremio handles the cross-source join with PostgreSQL budget data.

Predicate Pushdown Support

Oracle has one of the most comprehensive pushdown profiles in Dremio. The engine offloads:

  • All standard comparisons and logical operators
  • Aggregations: SUM, AVG, COUNT, MIN, MAX, STDDEV, MEDIAN, VAR_POP, VAR_SAMP, COVAR_POP, COVAR_SAMP, PERCENTILE_CONT, PERCENTILE_DISC
  • Math functions: ABS, CEIL, FLOOR, ROUND, MOD, SQRT, POWER, LOG, EXP, SIGN, trigonometric functions (SIN, COS, TAN, ASIN, ACOS, ATAN, SINH, COSH, TANH)
  • String functions: CONCAT, SUBSTR, LENGTH, LOWER, UPPER, TRIM, REPLACE, REVERSE, LPAD, RPAD
  • Date functions: DATE_ADD, DATE_SUB, DATE_TRUNC, EXTRACT, ADD_MONTHS, LAST_DAY, TO_CHAR, TO_DATE, TRUNC

This extensive pushdown support means Oracle does most of the heavy lifting for filtering and aggregation, and Dremio only transfers the summarized results across the network.

Data Type Mapping

Oracle Dremio Notes
NUMBER DECIMAL Preserves precision
VARCHAR2 / NVARCHAR2 / CHAR / NCHAR VARCHAR
DATE DATE or TIMESTAMP Use advanced option to map to TIMESTAMP
TIMESTAMP TIMESTAMP
BINARY_FLOAT FLOAT
BINARY_DOUBLE DOUBLE
FLOAT DOUBLE
BLOB / RAW / LONG RAW VARBINARY
LONG VARCHAR
INTERVALDS INTERVAL (day to seconds)
INTERVALYM INTERVAL (years to months)

CLOB, XMLTYPE, and Oracle spatial types are not supported through the connector.

Build a Semantic Layer Over Oracle

Create views that translate Oracle's technical schema into business-friendly analytics:

CREATE VIEW analytics.gold.department_performance AS
SELECT
  d.department_name,
  COUNT(e.employee_id) AS employee_count,
  ROUND(AVG(e.salary), 2) AS avg_salary,
  MAX(e.hire_date) AS most_recent_hire,
  CASE
    WHEN COUNT(e.employee_id) > 50 THEN 'Large'
    WHEN COUNT(e.employee_id) > 20 THEN 'Medium'
    ELSE 'Small'
  END AS department_size
FROM "erp-oracle".HR.DEPARTMENTS d
LEFT JOIN "erp-oracle".HR.EMPLOYEES e ON d.department_id = e.department_id
GROUP BY d.department_name;

Attach wiki context via the Catalog (edit pencil icon → Details → Generate Wiki/Tags) so the AI Agent can answer questions like "Which large departments have the highest average salary?"

When to Keep Data in Oracle vs. Migrate to Iceberg

Keep in Oracle: Actively transactional data (current orders, inventory, ledger entries), data that applications read and write frequently, data subject to Oracle-specific constraints and triggers.

Migrate to Iceberg: Historical archives (closed fiscal quarters, past-year orders), aggregated reporting tables, datasets queried heavily for analytics but rarely written.

For data that stays in Oracle, create manual Reflections with a refresh schedule that balances data freshness against Oracle CPU usage. For migrated data, Dremio's Open Catalog provides automated compaction, time travel, and Autonomous Reflections.

AI-Powered Analytics on Oracle Data

Dremio AI Agent

The AI Agent lets business users ask questions about Oracle data in plain English. An HR director asks "Which large departments have the highest average salary?" and the Agent generates accurate SQL by reading the wiki descriptions on your department_performance view. The Agent understands what "large" means (employee_count > 50) because you've defined it in the semantic layer.

This is particularly valuable for Oracle environments where decades of institutional knowledge about schema structures, table naming conventions (like HR.DEPARTMENTS), and column semantics lives in senior DBAs' heads.

Dremio MCP Server

The Dremio MCP Server connects Claude, ChatGPT, and other AI clients to your Oracle data through Dremio:

  1. Create a Native OAuth application in Dremio Cloud
  2. Configure redirect URLs (e.g., https://claude.ai/api/mcp/auth_callback for Claude, https://chatgpt.com/connector_platform_oauth_redirect for ChatGPT)
  3. Connect using mcp.dremio.cloud/mcp/{project_id} (US) or mcp.eu.dremio.cloud/mcp/{project_id} (EU)

A CFO asks Claude "Compare department headcount and budget utilization across our Oracle ERP" and gets governed, accurate results from Oracle data — without knowing SQL or Oracle table structures.

AI SQL Functions

Use AI directly in queries against Oracle data:

-- Classify departments by operational health
SELECT
  department_name,
  employee_count,
  avg_salary,
  department_size,
  AI_CLASSIFY(
    'Based on these HR metrics, classify the department health',
    'Department: ' || department_name || ', Employees: ' || CAST(employee_count AS VARCHAR) || ', Avg Salary: $' || CAST(avg_salary AS VARCHAR) || ', Size: ' || department_size,
    ARRAY['Thriving', 'Stable', 'Understaffed', 'Needs Attention']
  ) AS department_health
FROM analytics.gold.department_performance;

-- Generate executive briefings from Oracle data
SELECT
  department_name,
  AI_GENERATE(
    'Write a one-sentence executive summary for this department',
    'Department: ' || department_name || ', Headcount: ' || CAST(employee_count AS VARCHAR) || ', Avg Salary: $' || CAST(avg_salary AS VARCHAR) || ', Most Recent Hire: ' || CAST(most_recent_hire AS VARCHAR)
  ) AS executive_summary
FROM analytics.gold.department_performance
WHERE department_size = 'Large';

AI_CLASSIFY runs LLM inference to categorize departments. AI_GENERATE creates narrative summaries. Both run inline in your SQL queries, enriching Oracle data with AI.

Reflections for Performance

Oracle Database licensing is expensive — especially Enterprise Edition with Analytics and Diagnostics Packs. Reflections offload analytical queries:

  1. Navigate to the view in the Catalog
  2. Click the Reflections tab
  3. Choose Raw Reflection or Aggregation Reflection
  4. Select columns and set the Refresh Interval — for HR data, daily; for financial data, match to reporting cycles
  5. Click Save

BI tools get sub-second responses from Reflections. Oracle focuses on transactional workloads. A department performance dashboard with hourly refreshes generates zero Oracle CPU consumption after the Reflection is built.

Governance on Oracle Data

Oracle has its own security model (Oracle Database Vault, VPD), but it doesn't extend to non-Oracle sources. Dremio's Fine-Grained Access Control (FGAC) provides unified governance:

  • Column masking: Mask salary data, employee SSNs, and performance ratings from specific roles. An HR generalist sees headcount but not compensation details.
  • Row-level filtering: Department-level access — a department manager sees only their department. Regional HR sees only their region.
  • Unified policies: Same governance applies across Oracle, PostgreSQL, S3, and all other sources.

These policies apply across SQL Runner, BI tools (Arrow Flight/ODBC), AI Agent, and MCP Server.

Connect BI Tools via Arrow Flight

Arrow Flight provides 10-100x faster data transfer than JDBC/ODBC:

  • Tableau: Dremio connector for direct Arrow Flight access
  • Power BI: Dremio ODBC driver or native connector — no Oracle client needed
  • Python/Pandas: pyarrow.flight for programmatic data access
  • Looker: Connect via JDBC
  • dbt: dbt-dremio for SQL-based transformations

All queries benefit from Reflections, governance, and the semantic layer.

VS Code Copilot Integration

Dremio's VS Code extension with Copilot integration lets developers query Oracle data from their IDE. Ask Copilot "Show me understaffed departments from Oracle HR" and get SQL generated from your semantic layer — without knowing Oracle schema conventions.

When to Keep Data in Oracle vs. Migrate

Keep in Oracle: Transactional data for active applications, data with PL/SQL dependencies (stored procedures, triggers, packages), data subject to Oracle RAC clustering, data managed by Oracle GoldenGate replication.

Migrate to Iceberg: Historical HR data and archives, closed fiscal year financials, data consumed by non-Oracle tools, datasets where Oracle per-core licensing exceeds analytical value. Migrated Iceberg tables get automatic compaction, time travel, and Autonomous Reflections.

For data staying in Oracle, create manual Reflections to reduce Oracle CPU load. For migrated Iceberg data, Dremio handles optimization automatically.

Get Started

Oracle Database users pay a premium for their database's reliability and enterprise features. Dremio Cloud lets you extract analytical value from that data without additional Oracle licensing, ETL pipelines, or vendor-specific BI tools.

Try Dremio Cloud free for 30 days and connect your Oracle databases alongside your other data sources.

Check out these related posts:

📬 Join the Mailing List

Get updates directly to your inbox.

Subscribe Now

Menu

Search