DataFusionSharp
.NET bindings for Apache DataFusion, a fast, extensible query engine built on Apache Arrow for high-performance analytical query processing.
Note: This is an independent community project and is not officially associated with or endorsed by the Apache Software Foundation or the Apache DataFusion project.
Features
- SQL queries — Execute SQL against CSV, Parquet, and JSON files
- Apache Arrow — Zero-copy data exchange via the Arrow columnar format
- Object stores — Query data on S3, Azure Blob Storage, Google Cloud Storage, or the local filesystem
- Hive partitioning — Read and write Hive-style partitioned datasets
- Parameterized queries — Bind named parameters with type-safe ScalarValue types
- Cross-platform — Linux (x64/arm64), Windows (x64), macOS (arm64)
Quick Install
dotnet add package DataFusionSharp
Minimal Example
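using DataFusionSharp;
// Create the runtime and a session context; both are disposed when they go out of scope.
using var runtime = DataFusionRuntime.Create();
using var context = runtime.CreateSessionContext();
// Register orders.csv as a table named "orders".
await context.RegisterCsvAsync("orders", "orders.csv");
// Total the amount column per customer, then display the result.
using var df = await context.SqlAsync("SELECT customer_id, sum(amount) AS total FROM orders GROUP BY customer_id");
await df.ShowAsync();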
Next Steps
- Installation — prerequisites and platform support
- Quick Start — full working example walkthrough
- Core Concepts — understand the runtime, session, and DataFrame model