Python Recipe

The Python recipe lets you run custom PySpark code within your dataflow. Use it for simple or complex operations that are not currently available as a recipe.

Configuration

Configuration        Description
Recipe Name          A freeform name for the recipe
Input                Select one or more previously constructed recipes. The code tool can ingest multiple inputs.
Change Column Type   Enter the PySpark code to run within the recipe

Or, use Incorta Nexus to generate an intelligent code suggestion:
  ●   Enter the description or desired outcome.
  ●   Select the Incorta Nexus icon next to the description field.
  ●   View the generated PySpark code based on your input in the description field.

Important PySpark commands

Input Data

The inputs assigned in the Multiple Inputs setting populate a new configuration section called Input Variable Name, which shows the actual name of each underlying data frame. Copy these variable names to reference the input data frames in your code.
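
As a minimal sketch of how the copied variable names might be used, the snippet below joins two inputs with the standard PySpark DataFrame join method. The names orders_df and customers_df and the column customer_id are hypothetical placeholders; substitute the names shown under Input Variable Name and your own columns.

```python
# Hypothetical names: replace orders_df and customers_df with the names
# copied from the Input Variable Name section, and adjust the join column.

def join_inputs(orders, customers):
    """Attach customer attributes to each order row with a left join."""
    return orders.join(customers, on="customer_id", how="left")


# Inside the recipe the input data frames already exist, so the call would
# look like this (commented out here because the names are assumed):
# enriched_df = join_inputs(orders_df, customers_df)
```

Wrapping the logic in a small function keeps the transformation separate from the input names, so the names can be swapped without touching the join itself.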

Output Data

Each Code recipe must produce a data frame as its output for the recipe to save properly. Use the function output_df() to save the data frame. Example: output_df(my_sample_df)
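
Putting this together, a recipe body might look like the sketch below. It assumes an input data frame named my_input_df (a placeholder for a name copied from Input Variable Name) with hypothetical columns order_id and amount. The output_df() function is assumed to be provided by the recipe environment rather than imported.

```python
def build_result(df):
    """Keep rows with a positive amount and add a derived tax column."""
    return (
        df.filter("amount > 0")  # SQL-style condition string
          .selectExpr("order_id", "amount", "amount * 0.1 AS tax")
    )


# Inside the recipe (commented out here because my_input_df and output_df
# exist only in the recipe environment):
# my_sample_df = build_result(my_input_df)
# output_df(my_sample_df)
```

The 10% tax rate and the column names are illustrative only; the point is that the last step passes a single data frame to output_df().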