Acknowledgments
Introduction
- Feedback
1 What Is a Query Engine?
- 1.1 Why Are Query Engines Popular?
- 1.2 What This Book Covers
- 1.3 Source Code
- 1.4 Why Kotlin?
2 Apache Arrow
- 2.1 Arrow Memory Model
- 2.2 Inter-Process Communication (IPC)
- 2.3 Compute Kernels
- 2.4 Arrow Flight Protocol
- 2.5 Arrow Flight SQL
- 2.6 Query Engines
3 Choosing a Type System
- Source code
- 3.1 Row-Based or Columnar?
- 3.2 Interoperability
- 3.3 Type System
4 Data Sources
- Source code
- 4.1 Data Source Interface
- 4.2 Data Source Examples
5 Logical Plans & Expressions
- Source code
- 5.1 Printing Logical Plans
- 5.2 Serialization
- 5.3 Logical Expressions
- 5.4 Column Expressions
- 5.5 Literal Expressions
- 5.6 Binary Expressions
- 5.7 Comparison Expressions
- 5.8 Boolean Expressions
- 5.9 Math Expressions
- 5.10 Aggregate Expressions
- 5.11 Logical Plans
- 5.12 Scan
- 5.13 Projection
- 5.14 Selection (also known as Filter)
- 5.15 Aggregate
6 Building Logical Plans
- Source code
- 6.1 Building Logical Plans the Hard Way
- 6.2 Building Logical Plans Using DataFrames
7 Physical Plans & Expressions
- Source code
- 7.1 Physical Expressions
- 7.2 Column Expressions
- 7.3 Literal Expressions
- 7.4 Binary Expressions
- 7.5 Comparison Expressions
- 7.6 Math Expressions
- 7.7 Aggregate Expressions
- 7.8 Physical Plans
- 7.9 Scan
- 7.10 Projection
- 7.11 Selection (also known as Filter)
- 7.12 Hash Aggregate
- 7.13 Joins
- 7.14 Subqueries
- 7.15 Creating Physical Plans
8 Query Planner
- Source code
- 8.1 Translating Logical Expressions
- 8.2 Column Expressions
- 8.3 Literal Expressions
- 8.4 Binary Expressions
- 8.5 Translating Logical Plans
- 8.6 Scan
- 8.7 Projection
- 8.8 Selection (also known as Filter)
- 8.9 Aggregate
9 Query Optimizations
- Source code
- 9.1 Rule-Based Optimizations
- 9.2 Cost-Based Optimizations
10 Query Execution
- 10.1 Apache Spark Example
- Source code
- 10.2 KQuery Examples
- Source code
- 10.3 Removing the Query Optimizer
11 SQL Support
- Source code
- 11.1 Tokenizer
- 11.2 Pratt Parser
- 11.3 Parsing SQL Expressions
- 11.4 Parsing a SELECT Statement
- 11.5 SQL Query Planner
- 11.6 Translating SQL Expressions
- 11.7 Planning SELECT
- 11.8 Planning for Aggregate Queries
12 Parallel Query Execution
- 12.1 Combining Results
- 12.2 Smarter Partitioning
- 12.3 Partition Keys
- 12.4 Parallel Joins
13 Distributed Query Execution
- 13.1 Embarrassingly Parallel Operators
- 13.2 Distributed Aggregates
- 13.3 Distributed Joins
- 13.4 Distributed Query Scheduling
- 13.5 Producing a Distributed Query Plan
- 13.6 Serializing a Query Plan
- 13.7 Serializing Data
- 13.8 Choosing a Protocol
- 13.9 Streaming
- 13.10 Custom Code
- 13.11 Distributed Query Optimizations
14 Testing
- 14.1 Unit Testing
- 14.2 Integration Testing
- 14.3 Fuzzing
15 Benchmarks
- 15.1 Measuring Performance
- 15.2 Measuring Scalability
- 15.3 Concurrency
- 15.4 Automation
- 15.5 Comparing Benchmarks
- 15.6 Publishing Benchmark Results
- 15.7 Transaction Processing Council (TPC) Benchmarks
Further Resources
- Open-Source Projects
- YouTube
- Sample Data
