All Data Studio Recipes

The Data Studio is a rapid low-code/no-code environment for developing materialized views (MVs). With the Data Studio, developers connect a series of analytical steps (Recipes) to build an enriched dataset. Once the analytical steps are defined in the Data Studio, the developer outputs the Dataflow as PySpark into a Physical Schema of their choice.

Activating the Data Studio

To activate the Data Studio, please refer to this guide.

Recipe Index

Recipes are the foundational tools for performing data quality checks, cleansing, analysis, blending, and much more.

Content Transformation

Change Type - Change the data type of column(s).
Filter - Remove records from a dataset based on a condition.
Select - Select which columns to keep or remove from a dataset.
Sort - Sort data within a dataset.
Unpivot - Transpose your dataset into columns and values.
Formula - Add custom logic to create a new calculated field.
Sample - Select a subset of records within your dataset.
Aggregation - Aggregate your dataset and set its granularity through 'group by' logic.
Rename - Rename column labels in your dataset.
Split - Split the dataset into two datasets.
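
As a rough illustration of what two of these recipes do to the rows of a dataset, the sketch below unpivots a small wide table and then aggregates the result with 'group by' logic. It is written in plain Python rather than PySpark so that it runs without a Spark installation. It is not the code the Data Studio generates, and the dataset, the function names (unpivot, aggregate_sum), and the field names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical wide dataset: one row per region, one column per quarter.
wide_rows = [
    {"region": "East", "q1": 100, "q2": 120},
    {"region": "West", "q1": 80, "q2": 95},
]


def unpivot(rows, id_columns, value_columns, name_field="quarter", value_field="sales"):
    """Emit one output row per (input row, value column) pair."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            record = {key: row[key] for key in id_columns}
            record[name_field] = column       # the former column label
            record[value_field] = row[column]  # the value that column held
            long_rows.append(record)
    return long_rows


def aggregate_sum(rows, group_by, value_field):
    """Sum value_field for each distinct value of the group_by field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_by]] += row[value_field]
    return dict(totals)


# Unpivot: 2 wide rows with 2 quarter columns become 4 long rows.
long_rows = unpivot(wide_rows, ["region"], ["q1", "q2"])

# Aggregation: the granularity drops from region-and-quarter to region.
totals_by_region = aggregate_sum(long_rows, "region", "sales")
```

The same pattern shows why the order of recipes matters: aggregating after the unpivot lets a single 'group by' field cover every quarter column at once.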

Structure Transformation

Join - Join two datasets based on a set of join conditions.
Union - Union two datasets together.
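
The difference between the two recipes can be sketched in plain Python: a join widens rows by matching on a key, while a union stacks rows from datasets with the same columns. This is a conceptual sketch only. The datasets and helper names (inner_join, union_all) are hypothetical, an inner join is just one possible join type, and whether the Union recipe removes duplicate rows is not stated here, so the sketch keeps them.

```python
# Hypothetical input datasets.
orders = [
    {"order_id": 1, "customer_id": "C1"},
    {"order_id": 2, "customer_id": "C2"},
    {"order_id": 3, "customer_id": "C9"},  # no matching customer
]
customers = [
    {"customer_id": "C1", "name": "Acme"},
    {"customer_id": "C2", "name": "Globex"},
]
more_orders = [
    {"order_id": 4, "customer_id": "C1"},
]


def inner_join(left, right, key):
    """Combine each left row with every right row that shares the same key value."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


def union_all(first, second):
    """Stack the rows of two datasets that share the same columns."""
    return first + second


# Join: adds the customer name to each order that has a matching customer.
orders_with_names = inner_join(orders, customers, "customer_id")

# Union: appends the second batch of orders to the first.
all_orders = union_all(orders, more_orders)
```

Order 3 drops out of the joined result because no customer row carries the key C9; a different join type would treat that row differently.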

Data Quality and Validation

Fuzzy Join - Cleanse data by providing a lookup table.
Data Quality - Write a set of conditions and flag any violations in your dataset.
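
The idea behind both recipes can be sketched with the Python standard library: each raw value is matched against a lookup table of accepted values, and anything that cannot be matched is flagged as a violation. The matching algorithm the Fuzzy Join recipe uses is not described here, so difflib's similarity ratio stands in for it as an assumption. The lookup values, the 0.8 cutoff, and the function name fuzzy_clean are hypothetical.

```python
import difflib

# Hypothetical lookup table of accepted values.
lookup = ["United States", "United Kingdom", "Germany"]

# Hypothetical raw column containing misspellings and one unknown value.
raw_values = ["United Sates", "Germny", "United Kingdom", "Atlantis"]


def fuzzy_clean(value, accepted, cutoff=0.8):
    """Return the closest accepted value, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(value, accepted, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Fuzzy join: replace each raw value with its closest lookup entry.
cleaned = [fuzzy_clean(value, lookup) for value in raw_values]

# Data quality condition: every value must resolve to a lookup entry.
violations = [raw for raw, clean in zip(raw_values, cleaned) if clean is None]
```

Raising the cutoff makes the match stricter and sends more values to the violations list; lowering it repairs more misspellings at the risk of wrong matches.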

Advanced Querying

LLM - Unleash the power of AI in your Dataflow.
Python - Inject custom PySpark into your Dataflow.
SQL - Inject custom SQL into your Dataflow.

Deploy and Eject Operations

Save MV - Save your Dataflow output to a Materialized View.