DataOS PyFlare
DataOS PyFlare is a Python library that streamlines data operations and facilitates interaction with Apache Spark within DataOS. It is a wrapper around Flare that adds Python support for DataOS capabilities. By simplifying the loading, transformation, and storage of data, the library abstracts the complexities of data flow so that users can focus on data transformations and business logic. It also lets existing Spark job code bases be integrated with DataOS with minimal modifications.
Key features
Streamlined Data Operations
PyFlare offers a unified interface for data loading, transformation, and storage, which reduces development complexity and shortens project timelines.
Data Connector Integration
It integrates with various data connectors, including depot and non-depot sources, by leveraging the SDK's built-in capabilities.
Customizability and Extensibility
PyFlare can be tailored to specific project requirements and works alongside existing Python libraries and frameworks designed for data manipulation.
Optimized for DataOS
PyFlare is tuned for the DataOS platform, making it well suited to managing and processing data within DataOS environments.
Installation
By default, the DataOS environment does not include support for the DataOS-native Jupyter Notebook, but it can be added to the environment as needed. The PyFlare module is compatible with Jupyter Notebooks and can also be used in Python programs across different environments.
Prerequisites
This section describes the steps to follow before installing PyFlare.
Ensure you have Python ≥ 3.7 installed
Prior to installation, ensure that you have Python 3.7 or later installed on your system. You can check the Python version by running:
For Linux/macOS
For Windows
If you do not have Python, please install the latest version from python.org.
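The version check can also be done from inside Python, which behaves the same on Linux, macOS, and Windows. The following is a minimal sketch using only the standard library's sys.version_info; the helper name meets_minimum_python is hypothetical and is not part of PyFlare.

```python
import sys

# Minimum interpreter version stated in the prerequisites above.
MIN_PYTHON = (3, 7)


def meets_minimum_python(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version_info[:2]) >= tuple(minimum)


# Report the interpreter that is executing this code.
print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets minimum:", meets_minimum_python())
```

Passing an explicit tuple, for example meets_minimum_python((3, 6, 9)), lets you test a version other than the one currently running.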
Note: If you're using an enhanced shell like IPython or Jupyter notebook, you can run system commands by prefacing them with a ! character.
Ensure you have pip installed
Additionally, you'll need to make sure you have pip available. You can check this by running:
For Linux/macOS
For Windows
If you installed Python from source, with an installer from python.org, or via Homebrew, you should already have pip. If you're on Linux and installed using your OS package manager, you may have to install pip separately; see Installing pip/setuptools/wheel with Linux Package Managers.
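Whether pip is available to a particular interpreter can also be checked from Python itself. The sketch below is illustrative and uses only the standard library; the helper names are hypothetical. It invokes pip through sys.executable so that the check applies to the interpreter running the code and not to whichever pip happens to be first on the PATH.

```python
import importlib.util
import subprocess
import sys


def pip_is_importable():
    """Return True if the running interpreter can locate the pip module."""
    return importlib.util.find_spec("pip") is not None


def pip_version_output():
    """Run `<this interpreter> -m pip --version`.

    Returns pip's version line, or None if the command fails
    (for example because pip is not installed).
    """
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


print("pip importable:", pip_is_importable())
print("pip reports:", pip_version_output())
```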
Installing from PyPI
The dataos-pyflare library can be installed from the Python Package Index (PyPI) using the following command:
Recommendation
Install the dataos-pyflare==0.1.13 version of PyFlare, as it is the designated stable release.
For Linux/macOS
For Windows
Note: If you're using an enhanced shell like IPython or Jupyter notebook, you must restart the runtime in order to use the newly installed package.
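From a script or notebook, the pinned install and a follow-up check can be sketched in Python. This is an illustration under stated assumptions, not an official PyFlare procedure: the helper names are hypothetical, and importlib.metadata is part of the standard library only from Python 3.8, so on Python 3.7 the sketch falls back to the importlib_metadata backport, which is assumed to be installed separately.

```python
import sys

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Python 3.7: assumes the importlib_metadata backport is installed.
    import importlib_metadata as metadata

PACKAGE = "dataos-pyflare"
STABLE_VERSION = "0.1.13"  # stable release named in the recommendation above


def pip_install_command(package=PACKAGE, version=STABLE_VERSION):
    """Build a pip command that targets the interpreter running this code."""
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", requirement]


def installed_version(package=PACKAGE):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# To run the install for real, pass the command to subprocess:
#   import subprocess
#   subprocess.run(pip_install_command(), check=True)
print("Install command:", " ".join(pip_install_command()))
print("Currently installed version:", installed_version())
```

Building the command from sys.executable installs into the same environment that later imports the package, which avoids a common mismatch when several Python installations exist on one machine.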
Install from Source Distribution
pip can install from either Source Distributions (sdist) or Wheels, but if both are present on PyPI, pip will prefer a compatible wheel.
If pip does not find a wheel to install, it will locally build a wheel and cache it for future installs, instead of rebuilding the source distribution in the future.
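If you specifically want pip to build from the source distribution and not use a wheel, pip's --no-binary option disables wheels for the named package. The sketch below only builds the command and does not run it. It assumes that a source distribution of dataos-pyflare is published on PyPI, which this page does not confirm, and the helper name is hypothetical.

```python
import sys


def sdist_install_command(package="dataos-pyflare", version="0.1.13"):
    """Build a pip command that refuses wheels for `package`, forcing a source build."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-binary", package,  # do not use wheel distributions for this package
        f"{package}=={version}",
    ]


print(" ".join(sdist_install_command()))
```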
Supported sources
Additional links
PyFlare library reference
For a comprehensive reference guide to PyFlare, including detailed information on its modules and classes, please consult the PyFlare Library Reference.