Apache Doris is a high-performance, real-time analytical database based on MPP (Massively Parallel Processing) architecture. If you need to build and debug Apache Doris from source, this guide provides a step-by-step approach, including setup, dependencies, troubleshooting, and debugging techniques.

Prerequisites

Before compiling Apache Doris from source, ensure you have the required dependencies and system setup:

  • Operating System: Ubuntu 20.04+ / CentOS 7+ / macOS
  • CPU: x86-64 or ARM64
  • Memory: At least 16GB RAM (recommended 32GB for large builds)
  • Disk Space: 50GB free (to accommodate build artifacts and dependencies)
  • Development Tools:
    • GCC (>= 8.3.0) or Clang
    • CMake (>= 3.13)
    • Maven (>= 3.6)
    • JDK (>= 8, recommended JDK 11)
    • Python (>= 3.6)
    • Node.js (for building frontend UI)

Ensure you install these dependencies before proceeding.
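
A quick way to confirm that the toolchain meets these minimums is to compare each tool's reported version against the threshold. The following is a minimal sketch, not an official check: the helper names (version_ge, check_tool, first_version) are hypothetical, and it assumes a sort that supports -V, as GNU coreutils does.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM FOUND: print one status line for a tool.
check_tool() {
    if [ -z "$3" ]; then
        echo "$1: not found (need >= $2)"
    elif version_ge "$3" "$2"; then
        echo "$1: $3 ok"
    else
        echo "$1: $3 is older than $2"
    fi
}

# first_version: extract the first dotted version number from stdin.
first_version() { grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1; }

# Minimums are taken from the prerequisites list above.
check_tool gcc   8.3.0 "$(gcc -dumpfullversion -dumpversion 2>/dev/null)"
check_tool cmake 3.13  "$(cmake --version 2>/dev/null | first_version)"
check_tool mvn   3.6   "$(mvn -v 2>/dev/null | first_version)"
```

A tool that is missing from PATH is reported as "not found" rather than aborting the script, so all three checks always run.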

Cloning the Source Code

To get the source, clone the repository from GitHub. The default branch tracks ongoing development; for a stable build, check out the release branch or tag you need after cloning:

# Clone the Apache Doris repository
git clone --recursive https://github.com/apache/doris.git
cd doris

Make sure to fetch all submodules with --recursive to avoid missing dependencies.
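
If the repository was cloned without --recursive, the gap can be detected afterwards. In the output of git submodule status, a leading "-" marks a submodule that has not been initialized. The sketch below counts such entries; count_uninitialized is a hypothetical helper name.

```shell
#!/bin/sh
# count_uninitialized: read `git submodule status` output on stdin and
# count the entries whose first character is '-', which git uses to
# mark a submodule that has not been initialized.
count_uninitialized() {
    grep -c '^-' || true   # grep exits non-zero on zero matches; keep going
}

# Typical use inside the cloned repository (left commented out so the
# sketch has no side effects when pasted into a shell):
#   missing=$(git submodule status --recursive | count_uninitialized)
#   [ "$missing" -eq 0 ] || git submodule update --init --recursive
```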

Installing Required Dependencies

Use the following commands to install dependencies on different platforms:

Ubuntu

sudo apt update
sudo apt install -y cmake gcc g++ libtool automake flex bison \
    pkg-config curl unzip git python3 python3-pip openjdk-11-jdk \
    maven nodejs

CentOS

sudo yum groupinstall -y "Development Tools"
# cmake3 is packaged in EPEL, so enable that repository first
sudo yum install -y epel-release
sudo yum install -y cmake3 gcc gcc-c++ libtool \
    automake flex bison pkgconfig curl unzip git python3 \
    python3-pip java-11-openjdk-devel maven nodejs

Compiling Apache Doris

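A source build of Apache Doris produces two services and depends on a set of third-party libraries that are built first:

  1. BE (Backend) – Stores data and executes queries (C++)
  2. FE (Frontend) – Handles client connections, query parsing and planning, and metadata (Java)
  3. Third-party libraries – Dependencies that the components are built against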

Step 1: Building Third-Party Libraries

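Before compiling Doris, build the third-party dependencies. The build scripts are written for Bash, so run them with bash rather than sh (on Ubuntu, sh is dash, which does not support everything the scripts use):

cd thirdparty
bash build-thirdparty.sh
cd ..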

This step downloads and compiles necessary libraries, such as Boost, Thrift, and Protobuf.

Step 2: Compiling the Backend (BE)

Compile the BE with:

bash build.sh --be

This process may take 10-30 minutes depending on your system.
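
Build time and memory pressure both depend on how many compile jobs run in parallel. The sketch below picks a job count capped by both core count and memory. The pick_jobs helper is hypothetical, the 2 GB-per-job budget is an assumed rule of thumb rather than a documented Doris figure, and the -j option in the usage comment is assumed to be supported by the build.sh in your checkout.

```shell
#!/bin/sh
# pick_jobs CORES MEM_GB: choose a parallel job count limited by both
# the number of cores and the available memory.
# Assumption: budget roughly 2 GB of RAM per compile job.
pick_jobs() {
    n=$1
    by_mem=$(( $2 / 2 ))
    if [ "$by_mem" -lt "$n" ]; then
        n=$by_mem
    fi
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example (commented out; nproc is Linux-specific):
#   bash build.sh --be -j "$(pick_jobs "$(nproc)" 16)"
```

On a 16-core machine with 16 GB of RAM this yields 8 jobs, which trades some build speed for a lower risk of the compiler being killed for running out of memory.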

Step 3: Compiling the Frontend (FE)

To build the FE:

bash build.sh --fe

If both builds succeed, the artifacts are placed under output/ in the repository root, in output/be and output/fe.
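
Before trying to start anything, it can help to confirm that the build actually produced what you expect. This is a sketch only: check_output is a hypothetical helper, and the relative paths it looks for are assumptions about the output layout that you may need to adjust for your Doris version.

```shell
#!/bin/sh
# check_output DIR: report which expected artifacts exist under DIR and
# return non-zero if any are missing.
# Assumption: the file list below matches your version's output layout.
check_output() {
    status=0
    for f in fe/bin/start_fe.sh be/bin/start_be.sh be/lib/doris_be; do
        if [ -e "$1/$f" ]; then
            echo "found   $1/$f"
        else
            echo "missing $1/$f"
            status=1
        fi
    done
    return "$status"
}

# Example, run from the repository root (commented out):
#   check_output output
```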

Running Apache Doris

Once compiled, start the FE first and then the BE from the output directory:

# Start Frontend
bash output/fe/bin/start_fe.sh --daemon

# Start Backend
bash output/be/bin/start_be.sh --daemon

Check output/fe/log/ and output/be/log/ for any startup issues.
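
A freshly started BE does not serve queries until it has been registered with the FE. The sketch below builds the registration statement; add_backend_sql is a hypothetical helper, and the usage lines assume a MySQL client, the FE's default query port 9030, the default passwordless root user, and the BE's default heartbeat port 9050.

```shell
#!/bin/sh
# add_backend_sql HOST PORT: print the SQL statement that registers a
# BE with the FE. PORT is the BE heartbeat port.
add_backend_sql() {
    printf 'ALTER SYSTEM ADD BACKEND "%s:%s";\n' "$1" "$2"
}

# Usage (commented out; requires a running FE and a MySQL client):
#   add_backend_sql 127.0.0.1 9050 | mysql -h 127.0.0.1 -P 9030 -uroot
#   mysql -h 127.0.0.1 -P 9030 -uroot -e 'SHOW BACKENDS\G'
```

In the SHOW BACKENDS output, the registered BE should eventually be reported as alive; if it is not, the BE log is the first place to look.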

Troubleshooting Common Issues

1. Compilation Errors

If you encounter compilation issues, try the following:

  • Ensure you have the correct versions of dependencies.
  • Clean previous builds and retry:
    bash build.sh --clean
  • Check the error logs for missing packages or incompatible versions.
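
Build output is long, and the first error is usually the one that matters. One approach is to save the output to a file and pull out the earliest error lines. In this sketch, first_errors is a hypothetical helper and the pattern list is an assumption that covers common compiler and linker messages, not every failure mode.

```shell
#!/bin/sh
# first_errors LOGFILE [COUNT]: print the first COUNT (default 10)
# error lines from a saved build log, prefixed with their line numbers.
first_errors() {
    grep -n -E 'error:|undefined reference|No such file or directory' "$1" \
        | head -n "${2:-10}"
}

# Usage (commented out):
#   bash build.sh --be 2>&1 | tee build.log
#   first_errors build.log
```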

2. Service Fails to Start

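Check the log of the component that failed:

tail -f output/fe/log/fe.log    # FE
tail -f output/be/log/be.INFO   # BE

Ensure that the default ports are not occupied by other processes. The FE uses 8030 (HTTP), 9010 (edit log), 9020 (RPC), and 9030 (MySQL protocol); the BE uses 8040 (web server), 8060 (bRPC), 9050 (heartbeat), and 9060 (Thrift).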
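
Port conflicts can be checked before starting either service. The sketch below filters the output of ss -ltn for a given set of ports. busy_ports is a hypothetical helper, and it assumes a Linux host with iproute2; the port list in the usage line reflects Doris defaults and should be adjusted if your fe.conf or be.conf overrides them.

```shell
#!/bin/sh
# busy_ports PORT...: read `ss -ltn` output on stdin and print each of
# the given ports that already has a listener.
busy_ports() {
    awk -v want="$*" '
        BEGIN {
            n = split(want, w, " ")
            for (i = 1; i <= n; i++) need[w[i]] = 1
        }
        NR > 1 {
            # Column 4 is the local address; the port follows the last colon.
            k = split($4, part, ":")
            if (part[k] in need) print part[k]
        }' | sort -un
}

# Usage (commented out):
#   ss -ltn | busy_ports 8030 9010 9020 9030 8040 8060 9050 9060
```

An empty result means none of the listed ports has a listener.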

3. Missing Third-Party Libraries

If the build process complains about missing libraries, rebuild third-party dependencies:

cd thirdparty
bash build-thirdparty.sh
cd ..

Debugging Apache Doris

Debugging Backend (BE)

Use GDB for debugging the BE component. The binary is placed in the BE's lib directory in the build output:

gdb --args output/be/lib/doris_be

Set breakpoints in GDB:

break be/src/olap/storage_engine.cpp:100
run
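
Typing the same breakpoints at the start of every session gets tedious, so they can be kept in a GDB command file instead. The sketch below writes one; the file name is arbitrary, and the breakpoint is the example location used above. "set breakpoint pending on" lets GDB accept a breakpoint whose symbols are not loaded yet.

```shell
#!/bin/sh
# Write a reusable GDB command file into the temp directory.
cmdfile="${TMPDIR:-/tmp}/doris_be.gdb"
cat > "$cmdfile" <<'EOF'
set pagination off
set breakpoint pending on
break be/src/olap/storage_engine.cpp:100
run
EOF
echo "wrote $cmdfile"

# Start the session with it (commented out; substitute the path to
# your doris_be binary):
#   gdb -x "$cmdfile" --args <path-to-doris_be>
```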

To attach to a running process:

ps aux | grep doris_be  # Find the process ID
sudo gdb -p <PID>

Debugging Frontend (FE)

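The FE runs on the JVM, so enable the JDWP agent before starting it:

export JAVA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"
bash output/fe/bin/start_fe.sh

Then attach a remote debugger to port 5005. The *:5005 form requires JDK 9 or later; on JDK 8, write address=5005 instead. If conf/fe.conf defines its own JAVA_OPTS, the start script may use that value in preference to the exported one, in which case add the option there.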
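
The address syntax of the JDWP option differs between JDK generations: JDK 9 and later listen only on localhost unless the address is written as *:PORT, while JDK 8 generally rejects the "*:" prefix and takes a bare port. The sketch below selects the right form; jdwp_opts is a hypothetical helper that expects the JDK major version as its first argument.

```shell
#!/bin/sh
# jdwp_opts JAVA_MAJOR PORT: print a JDWP agent option suited to the
# given JDK major version.
jdwp_opts() {
    if [ "$1" -ge 9 ]; then
        addr="*:$2"    # JDK 9+: explicit wildcard to accept remote hosts
    else
        addr="$2"      # JDK 8: bare port
    fi
    echo "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=$addr"
}

# Usage (commented out):
#   export JAVA_OPTS="$(jdwp_opts 11 5005)"
```

Because the wildcard form exposes the debug port on every interface, restricting it to localhost or using an SSH tunnel is the safer choice on shared machines.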

Logging and Tracing

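Log verbosity is configured in conf/be.conf and conf/fe.conf. Both use sys_log_level (INFO by default) for the base level, and debug-level output is enabled per module through sys_log_verbose_modules. For example, in fe.conf:

sys_log_verbose_modules = org.apache.doris

The accepted values differ between the FE and the BE, so check the configuration reference for your version. Restart Doris for the changes to take effect.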

Using gperftools for Performance Profiling

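Install gperftools from source. A Git checkout does not ship a configure script, so generate it with autogen.sh first:

git clone https://github.com/gperftools/gperftools.git
cd gperftools
./autogen.sh && ./configure && make && sudo make install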

Then run the BE binary with the CPU profiler preloaded:

LD_PRELOAD=/usr/local/lib/libprofiler.so CPUPROFILE=/tmp/doris.prof output/be/lib/doris_be

The profile is written when the process exits. Analyze it with:

pprof --text output/be/lib/doris_be /tmp/doris.prof

Conclusion

Building Apache Doris from source comes down to getting the toolchain and third-party libraries right, then compiling the BE and FE separately. Once the services are running, GDB, Java remote debugging, verbose logging, and gperftools profiling cover most of what you need to diagnose crashes, trace query behavior, and find performance hot spots.

If you run into problems, consult the official Apache Doris documentation or reach out to the open-source community, and consider contributing back by reporting issues and suggesting improvements.