Introducing: DAT Linux PRO tools. Enhance your DAT Linux with extra power-tools including back-up/restore, app update notifications, app monitoring, custom links tab, dark theme, etc. One payment, perpetual license. Get PRO now!


DAT Linux is a Linux distribution for data science. It brings together all your favourite open-source data science tools and apps into a ready-to-run desktop environment. It’s based on Ubuntu 22.04, so it’s easy to install and use. The custom DAT Linux Control Panel provides a centralised one-stop shop for running and managing dozens of data science programs. Read the FAQ.

📚️ Check out the DAT Linux curated list of free online data science e-books!

DAT Linux is perfect for students, professionals, academics, or anyone interested in data science who doesn’t want to spend endless hours downloading, installing, configuring, and maintaining applications from a range of sources, each with different technical requirements and set-up challenges.

👍 Recommend DAT Linux on DistroWatch

Need a customised DAT Linux ISO for your school/college/university? Get your own branded, customised data science distro here.

💳️ Please subscribe/donate to help support DAT Linux development

List of supported data science apps:

App | Description
BiRT | Eclipse BIRT™ is an open source reporting system for producing compelling BI reports
ClickHouse | ClickHouse is an open-source column-oriented DBMS for online analytical processing
Data Cleaner | Data Quality toolkit that allows you to profile, correct, and enrich your data
Datasette | Datasette is a tool for exploring and publishing data visually and with SQL
DB Browser | DB Browser for SQLite is a visual, open source tool to create, design, and edit database files compatible with SQLite
DBeaver | Free multi-platform database tool for developers, database administrators, analysts and all people who need to work with databases
Druid | Apache Druid is a real-time database to power modern analytics applications
D-Search | Convenient interface to the “webtools” R package to search for datasets in all CRAN packages
DuckDB | DuckDB is an in-process SQL OLAP database management system
E-Git | EGit is an Eclipse-based GUI for the Git version control system
Emacs+ESS | Emacs Speaks Statistics (ESS) is an add-on package for GNU Emacs to interact with statistical analysis programs such as R, S-Plus, SAS, Stata and OpenBUGS/JAGS
Gephi | Gephi is the leading visualization and exploration software for all kinds of graphs and networks
Glue-viz | Glue is a UI and Python library to explore relationships within and among related datasets
Gnumeric | Gnumeric is a spreadsheet program that is part of the GNOME Free Software Desktop Project
GNU Plot | gnuplot is a command-line and GUI program that can generate two- and three-dimensional plots of functions, data, and data fits
Grafana | Grafana is a popular open-source platform for data visualization and monitoring
G-Vim | A GUI wrapper for the Vim screen-based text editor program, with plugins for R installed
IPython | A command shell for interactive computing with a convenient console launcher
Julia | Julia is a high-level, high-performance, dynamic programming language
Jupyter Notebook | The Jupyter Notebook is a web-based interactive, scientific computing platform
Jupyter Lab | JupyterLab is the latest web-based interactive development environment for notebooks, code, and data
KNIME | KNIME Analytics Platform is open source software for data science
LabPlot | Free, open source and cross-platform Data Visualization and Analysis software accessible to everyone
LibreOffice Calc | LibreOffice Calc is the spreadsheet component of the LibreOffice software package
Luigi | Luigi provides a framework to develop and manage data processing pipelines
Meld | Meld is a visual file diff and merge tool
Metabase | Metabase is an open-source business intelligence tool
MOA | MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms
OpenRefine | OpenRefine is an open-source desktop application for data cleanup and transformation to other formats
Orange | Orange is a powerful platform to perform data analysis and visualization
Paraview | ParaView is an open-source, multi-platform data analysis and visualization application
Pluto | A Pluto notebook is made up of small blocks of Julia code (cells) that together form a reactive notebook
PSPP | GNU PSPP is a program for statistical analysis of sampled data. It is a free-as-in-freedom replacement for the proprietary program SPSS
QGIS | QGIS is a Free and Open Source Geographic Information System
Quarto | Quarto® is an open-source scientific and technical publishing system built on Pandoc
R | R is a free software environment for statistical computing and graphics
R-Studio | RStudio is an Integrated Development Environment (IDE) for R
Scilab | Scilab is a free and open-source, cross-platform numerical computational package and a high-level, numerically oriented programming language
Spyder | Spyder is a free and open source scientific environment written in Python, for Python, and designed by and for scientists, engineers and data analysts
Superset | Apache Superset is a modern, enterprise-ready business intelligence web application
Tabula | Tabula is a free tool for extracting data from PDF files into CSV and Excel files
Veusz | Veusz is a scientific plotting and graphing program with a graphical user interface, designed to produce publication-ready 2D and 3D plots
Visidata | VisiData is an interactive multitool for tabular data. It combines the clarity of a spreadsheet, the efficiency of the terminal, and the power of Python, and can handle millions of rows with ease
VSCodium | VSCodium is a community-driven, freely-licensed binary distribution of Microsoft’s editor VS Code (ready with plugins for R/RMarkdown, Python/Jupyter, Julia)
Weka | Weka is a GUI and collection of machine learning algorithms for data mining tasks
WxMaxima | wxMaxima is a document-based interface for the computer algebra system Maxima
Zeppelin | Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala, Python, R and more

NUMPY BY EXAMPLE - A Beginner's Guide to Learning NumPy by the DAT Linux team.
📖️ Read it free online, or 🛒️ buy the PDF or EPUB e-book from Leanpub.