Apache Doris is a high-performance, real-time analytical database based on MPP (Massively Parallel Processing) architecture. If you need to build and debug Apache Doris from source, this guide provides a step-by-step approach, including setup, dependencies, troubleshooting, and debugging techniques.
Prerequisites
Before compiling Apache Doris from source, ensure you have the required dependencies and system setup:
- Operating System: Ubuntu 20.04+ / CentOS 7+ / macOS
- CPU: x86-64 or ARM64
- Memory: At least 16GB RAM (recommended 32GB for large builds)
- Disk Space: 50GB free (to accommodate build artifacts and dependencies)
- Development Tools:
  - GCC (>= 8.3.0) or Clang
  - CMake (>= 3.13)
  - Maven (>= 3.6)
  - JDK (>= 8, recommended JDK 11)
  - Python (>= 3.6)
  - Node.js (for building frontend UI)
Ensure you install these dependencies before proceeding.
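The minimum versions listed above can be checked before a long build starts. The following is a minimal sketch, not part of the Doris tooling: version_ge and check_tool are hypothetical helper names, and the comparison assumes a sort that supports -V (GNU coreutils or a recent macOS).

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_tool NAME MINIMUM DETECTED: report whether DETECTED satisfies MINIMUM.
check_tool() {
    if version_ge "$3" "$2"; then
        echo "ok: $1 $3 (>= $2)"
    else
        echo "too old: $1 $3 (need >= $2)"
        return 1
    fi
}

# Example calls (commented out so the file can be sourced safely):
# check_tool cmake 3.13  "$(cmake --version | head -n1 | awk '{print $3}')"
# check_tool gcc   8.3.0 "$(gcc -dumpfullversion)"   # -dumpfullversion needs GCC 7+
# check_tool maven 3.6   "$(mvn -v | head -n1 | awk '{print $3}')"
```

Each call prints one line per tool and returns non-zero for a version that is too old, so the checks can be chained with && in front of the build.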
Cloning the Source Code
To get the latest stable or development version of Apache Doris, clone the repository from GitHub:
# Clone the Apache Doris repository
git clone --recursive https://github.com/apache/doris.git
cd doris
Make sure to fetch all submodules with --recursive to avoid missing dependencies.
Installing Required Dependencies
Use the following commands to install dependencies on different platforms:
Ubuntu
sudo apt update
sudo apt install -y cmake gcc g++ libtool automake flex bison \
pkg-config curl unzip git python3 python3-pip openjdk-11-jdk \
maven nodejs
CentOS
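# CentOS 7 ships GCC 4.8.5; a newer toolchain (for example via devtoolset)
# is needed to meet the GCC >= 8.3.0 requirement
sudo yum groupinstall -y "Development Tools"
# Install EPEL first so that packages such as cmake3 and nodejs can be resolved
sudo yum install -y epel-release
sudo yum install -y cmake3 gcc gcc-c++ libtool \
automake flex bison pkgconfig curl unzip git python3 \
python3-pip java-11-openjdk-devel maven nodejs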
Compiling Apache Doris
A source build of Apache Doris has three parts:
- BE (Backend) – The core processing engine, which stores data and executes queries
- FE (Frontend) – Query parsing and optimization layer, which also manages metadata
- Third-party libraries – Dependencies that the build compiles and links against
Step 1: Building Third-Party Libraries
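Before compiling Doris, build the third-party dependencies. The build scripts are written for bash, so on systems where sh is a different shell (such as Ubuntu, where it is dash), run them with bash instead of sh: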
cd thirdparty
sh build-thirdparty.sh
cd ..
This step downloads and compiles necessary libraries, such as Boost, Thrift, and Protobuf.
Step 2: Compiling the Backend (BE)
Compile the BE using the following commands:
sh build.sh --be
This process may take 10-30 minutes depending on your system.
Step 3: Compiling the Frontend (FE)
To build the frontend (FE):
sh build.sh --fe
If everything compiles successfully, the binaries will be located in the output directory.
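The three build steps above can be chained so that a failure stops the sequence and leaves a log behind. This is a rough sketch under stated assumptions: run_step and LOG_DIR are hypothetical names introduced here, and the commented commands simply repeat the ones from this guide.

```shell
#!/bin/sh
# run_step NAME CMD...: run one build step, capture stdout and stderr in a
# per-step log file, and return the command's exit status on failure.
run_step() {
    name=$1
    shift
    log="${LOG_DIR:-/tmp}/doris-build-$name.log"
    if "$@" >"$log" 2>&1; then
        echo "PASS $name"
    else
        status=$?
        echo "FAIL $name (exit $status), see $log"
        return "$status"
    fi
}

# Example pipeline, run from the root of the doris checkout
# (commented out so the file can be sourced safely):
# run_step thirdparty sh -c 'cd thirdparty && sh build-thirdparty.sh' &&
#     run_step be sh build.sh --be &&
#     run_step fe sh build.sh --fe
```

Because the steps are joined with &&, the FE build is skipped if the BE build fails, and the FAIL line names the log file to inspect.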
Running Apache Doris
Once compiled, start the FE and then the BE from the output directory:
# Start Frontend
sh output/fe/bin/start_fe.sh --daemon
# Start Backend
sh output/be/bin/start_be.sh --daemon
Check the logs in the log/ directory for any startup issues.
Troubleshooting Common Issues
1. Compilation Errors
If you encounter compilation issues, try the following:
- Ensure you have the correct versions of dependencies.
- Clean previous builds and retry:
sh build.sh --clean
- Check the error logs for missing packages or incompatible versions.
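Build logs are long, and the first error is usually the one that matters. A small sketch for pulling it out of a saved log: first_errors is a hypothetical helper, and the patterns assume the usual compiler "error:" diagnostics and Maven's "[ERROR]" and "BUILD FAILURE" markers.

```shell
#!/bin/sh
# first_errors LOGFILE [COUNT]: print the first COUNT (default 5) lines that
# look like compiler or Maven errors, prefixed with their line numbers.
first_errors() {
    grep -nE 'error:|\[ERROR\]|BUILD FAILURE' "$1" | head -n "${2:-5}"
}

# Example (commented out): save the build output, then inspect it.
# sh build.sh --be > /tmp/be-build.log 2>&1 || first_errors /tmp/be-build.log
```

The line numbers in the output make it easy to jump to the surrounding context in the full log.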
2. Service Fails to Start
Check the logs:
tail -f output/fe/log/fe.log
Ensure that the default ports (8030 for the FE HTTP server, 9050 for the BE heartbeat service) are not occupied by other processes.
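One way to check for port conflicts is to filter the listener table for the ports Doris needs. The sketch below assumes a Linux host with ss (from iproute2). ports_in_use is a hypothetical helper that reads `ss -ltn` style output on stdin, so it can also be fed a saved listing.

```shell
#!/bin/sh
# ports_in_use PORT...: read `ss -ltn` style output on stdin and print every
# requested port that already has a listening socket.
ports_in_use() {
    listing=$(cat)
    for port in "$@"; do
        # Column 4 is "Local Address:Port"; the port is the text after the last colon.
        if printf '%s\n' "$listing" |
            awk -v p="$port" 'NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 } END { exit !found }'; then
            echo "$port"
        fi
    done
}

# Example (commented out): check the two ports mentioned above.
# ss -ltn | ports_in_use 8030 9050
```

An empty result means none of the listed ports has a listener yet.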
3. Missing Third-Party Libraries
If the build process complains about missing libraries, rebuild third-party dependencies:
cd thirdparty
sh build-thirdparty.sh
cd ..
Debugging Apache Doris
Debugging Backend (BE)
Use GDB for debugging the BE component:
gdb --args output/be/lib/doris_be
Set breakpoints in GDB:
break be/src/olap/storage_engine.cpp:100
run
To attach to a running process:
ps aux | grep doris_be # Find the process ID
sudo gdb -p <PID>
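Typing the same breakpoints in every session gets tedious, so they can be kept in a GDB command file and loaded with -x. A minimal sketch: write_gdb_script is a hypothetical helper, and BE_BIN stands for the path to your doris_be binary.

```shell
#!/bin/sh
# write_gdb_script FILE LOCATION...: write a GDB command file that disables
# paging, sets one breakpoint per LOCATION, and then starts the program.
write_gdb_script() {
    out=$1
    shift
    {
        echo "set pagination off"
        for loc in "$@"; do
            echo "break $loc"
        done
        echo "run"
    } >"$out"
}

# Example (commented out): generate the file, then hand it to GDB.
# write_gdb_script /tmp/doris_be.gdb be/src/olap/storage_engine.cpp:100
# gdb -x /tmp/doris_be.gdb --args "$BE_BIN"
```

The generated file is plain text, so it can be edited by hand or checked into a personal dotfiles repository.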
Debugging Frontend (FE)
For Java-based FE debugging, use the following:
export JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"
sh fe/bin/start_fe.sh
Then, attach a remote debugger to port 5005.
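The address=*:5005 form is understood by JDK 9 and later. On JDK 8 the agent expects a bare port, which there listens on all interfaces. A small sketch that builds the option string for either case: jdwp_opts is a hypothetical helper, and the major version is passed in by the caller.

```shell
#!/bin/sh
# jdwp_opts PORT JAVA_MAJOR: print the JDWP agent option for the given JDK.
# JDK 9+ binds to localhost unless a host such as "*" is given; JDK 8 takes
# only the port number.
jdwp_opts() {
    port=$1
    major=$2
    if [ "$major" -ge 9 ]; then
        addr="*:$port"
    else
        addr="$port"
    fi
    echo "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=$addr"
}

# Example (commented out): export the options for JDK 11 before starting the FE.
# export JAVA_OPTS="$(jdwp_opts 5005 11)"
```

Listening on all interfaces exposes the debug port to the network, so on shared machines it is safer to bind to localhost and tunnel the port over SSH.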
Logging and Tracing
Modify log levels in conf/be.conf or conf/fe.conf to increase verbosity:
log_level=DEBUG
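Restart Doris for the changes to take effect. The exact key name and accepted values depend on the release (recent versions use sys_log_level), so check the comments in the configuration files shipped with your build.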
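When switching log levels back and forth during a debugging session, a small helper avoids hand-editing the file each time. This is a sketch under stated assumptions: set_conf_key is a hypothetical function, the key is treated as a plain string without regex metacharacters, and log_level is the key name used in this guide.

```shell
#!/bin/sh
# set_conf_key FILE KEY VALUE: replace an existing "KEY = ..." line in FILE,
# or append one if the key is not present. A backup is kept as FILE.bak
# whenever an existing line is rewritten.
set_conf_key() {
    file=$1
    key=$2
    value=$3
    if grep -qE "^[[:space:]]*$key[[:space:]]*=" "$file"; then
        sed -i.bak -E "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >>"$file"
    fi
}

# Example (commented out): raise verbosity, then restart the service.
# set_conf_key conf/be.conf log_level DEBUG
```

Other lines in the file, including comments, are left untouched.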
Using gperftools for Performance Profiling
git clone https://github.com/gperftools/gperftools.git
cd gperftools
# A git checkout has no configure script yet; autogen.sh generates it
./autogen.sh
./configure && make && sudo make install
Then run Doris with profiling enabled:
LD_PRELOAD=/usr/local/lib/libprofiler.so CPUPROFILE=/tmp/doris.prof ./output/be/lib/doris_be
Analyze the profile:
pprof --text ./output/be/lib/doris_be /tmp/doris.prof
Conclusion
Compiling and debugging Apache Doris from source requires careful attention to dependencies, system setup, and build procedures. This guide has covered the fundamental steps, from cloning the repository and installing prerequisites to troubleshooting and debugging both the backend and frontend components.
By following these steps, you will gain a deeper understanding of how Apache Doris works internally and be better equipped to diagnose and fix issues. GDB, Java remote debugging, log configuration, and performance profiling with gperftools give you the means to inspect and tune the database.
For developers and engineers working with real-time analytical workloads, mastering the compilation and debugging process can significantly improve system stability, performance tuning, and development efficiency.
If you run into problems, don’t hesitate to consult the official Apache Doris documentation, reach out to the open-source community, or contribute to the project by reporting issues and suggesting improvements. With proper setup and debugging techniques, Apache Doris can be a powerful tool for handling massive-scale analytical queries with high performance.