Configure Extraction
[STAGE 5: Configuration] Persist a tested extraction pipeline on a new or existing task. Accepts normalized steps (jsonConditions supported). Returns examples, textual next_steps, and typed next_tools suggestions for continued testing and validation.
Parameters
- Required
job_id
string
Monitoring job ID
task_id
string
Task ID to update (leave empty to create a new task)
task_name
string
Name for the task (required for new tasks)
data_type
string
Optional semantic label for the data (e.g., 'News article', 'Job posting', 'Price'). Used to derive a Data Type when data_type_id is not provided.
data_type_id
string
Optional: Explicit Data Type ID to use. When omitted, the tool infers a default type from steps/value type.
- Required
steps
array
Extraction steps. Examples:
- Single value: [{"method":"xPath","pattern":"//title"}]
- Multiple items + transform: [{"method":"xPath_extract","pattern":"//div[@class=\"item\"]"},{"method":"json","pattern":null,"jsonConditions":[{"function":"transform","key":"{\"title\":\"${ @step | xPath : \'//h2\' }\"}"}]}]
Array items
[item]
object
Nested fields
method
string
Extraction method for this step.
Allowed values: xPath, xPath_extract, json, regex, gpt
pattern
string
Extraction pattern. For json steps, set to null when providing jsonConditions.
jsonConditions
string
Optional stringified conditions for json step (filters, transforms). Will be decoded/normalized.
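The step fields above can be assembled programmatically. The following is a minimal Python sketch, not part of the tool itself: make_step is a hypothetical helper, the method check mirrors the allowed values listed above, and restricting jsonConditions to json steps is an assumption drawn from the field description. Because the schema types jsonConditions as a string, the sketch serializes the conditions list with json.dumps.

```python
import json

# Allowed values for "method", copied from the parameter list above.
ALLOWED_METHODS = {"xPath", "xPath_extract", "json", "regex", "gpt"}

# Transform template taken from the steps example in this reference.
TITLE_TEMPLATE = '{"title":"${ @step | xPath : \'//h2\' }"}'


def make_step(method, pattern=None, json_conditions=None):
    """Build one step object (hypothetical helper, not provided by the tool)."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    if json_conditions is not None and method != "json":
        # Assumption: jsonConditions is only meaningful on json steps.
        raise ValueError("jsonConditions applies to json steps only")
    step = {"method": method, "pattern": pattern}
    if json_conditions is not None:
        # The schema declares jsonConditions as a string, so stringify the list.
        step["jsonConditions"] = json.dumps(json_conditions)
    return step


# Multiple items + transform, mirroring the documented example.
steps = [
    make_step("xPath_extract", '//div[@class="item"]'),
    make_step("json", None, [{"function": "transform", "key": TITLE_TEMPLATE}]),
]

print(json.dumps(steps, indent=2))
```

Validating the method locally is a design convenience only; the tool is described as decoding and normalizing the conditions itself, so its own validation remains authoritative.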
test_first
boolean
Test extraction before saving (recommended)
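As an illustration of how the parameters fit together, here is a hedged Python sketch that assembles an arguments object for this tool. The function name build_configure_extraction_args and the ID "job_123" are hypothetical; only the parameter names and the rules stated above (job_id and steps required, empty task_id creates a new task, task_name required for new tasks) come from this reference. How the arguments are sent to the tool depends on the client and is not shown.

```python
import json


def build_configure_extraction_args(
    job_id,
    steps,
    task_id="",
    task_name=None,
    data_type=None,
    data_type_id=None,
    test_first=True,
):
    """Assemble the tool arguments (hypothetical helper for illustration)."""
    if not job_id:
        raise ValueError("job_id is required")
    if not steps:
        raise ValueError("steps is required and must not be empty")
    if not task_id and not task_name:
        # An empty task_id means a new task, which needs a name.
        raise ValueError("task_name is required when creating a new task")

    args = {
        "job_id": job_id,
        "task_id": task_id,
        "steps": steps,
        "test_first": test_first,  # testing before saving is recommended
    }
    # Include optional fields only when the caller supplied them.
    if task_name is not None:
        args["task_name"] = task_name
    if data_type is not None:
        args["data_type"] = data_type
    if data_type_id is not None:
        args["data_type_id"] = data_type_id
    return args


# New task with a single-value step; "job_123" is a placeholder ID.
args = build_configure_extraction_args(
    "job_123",
    [{"method": "xPath", "pattern": "//title"}],
    task_name="Page title",
    data_type="News article",
)
print(json.dumps(args, indent=2))
```

Omitting data_type_id here leaves the tool to infer a default type, as described for that parameter.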