# Step Configuration ④ - Detail Step

## Configuring Detail Steps (`detail_step`)

In the `detail_step` of a Study, you collect more in-depth information from the detail page behind each link extracted from the list page. Configuring the `detail_step` correctly is crucial for building a complete dataset for analysis.

### Step Description

The `detail_step` phase typically runs right after the [`list_step`](Jexter%20Configuration:Extract%20Page%20Information%20in%20the%20list_step%20.md) phase and extracts specific data from the detail page of each list item. This data may include, but is not limited to:

- Task ID (`dp2_id`)
- Company name (`company`)
- Product name (`drug_name`)
- Approval number (`auth_num`)
- Product specifications (`specification`)
- Product description (`drug_reference`)
- Attachments (`attachments`)

### Example Configuration

```json
{
  "data_in": {
    "data_for_test": [
      {
        "dp2_id": 12345678,
        "product_name": "Example Medication Name",
        "product_link": "https://www.examplepharm.com/product-detail?id=12345"
      }
    ]
  },
  "project_name": "examplepharm.drugs.detail",
  "url": "{product_link}",
  "type": "one-off",
  "priority": 2,
  "fetch_method": "direct",
  "method": "GET",
  "status": 1,
  "charset": "UTF-8",
  "charact_string_start": "",
  "charact_string_end": "",
  "add_only": 1,
  "excluded_workers": "{excluded_workers}",
  "interval": 5184000,
  "data_out": {
    "jpath": "",
    "api": {
      "url": "http://api2.example.cn/dp2/mongo/save",
      "table": "company.examplepharm.drugs",
      "type": "merge",
      "where": {
        "uniqueId": "{dp2_id}"
      }
    }
  }
}
```

In this configuration:

- `data_in` contains the product information passed from the [`list_step`](Jexter%20Configuration:Extract%20Page%20Information%20in%20the%20list_step%20.md) phase, including `dp2_id`, `product_name`, and `product_link`.
  Here, `12345678` is the example `dp2_id`, "Example Medication Name" is the product name, and `https://www.examplepharm.com/product-detail?id=12345` is the product link.
- `project_name` defines the name of the current Study, here `examplepharm.drugs.detail`.
- The `url` field uses the `{product_link}` placeholder, which stands for the URL of the detail page.
- Fields such as `type`, `priority`, `fetch_method`, and `method` define how the request is scheduled and issued. In this example it is a `one-off` task with priority `2`, fetched with the `direct` method using an HTTP `GET` request.
- The `data_out` field contains the details of the API call that saves the extracted data to the database. Here the `merge` type is used, indicating that new data is merged into existing records matched by `uniqueId`.
- The `add_only` field is set to `1`, meaning that if a record already exists, it will not be updated.

### Considerations

- Ensure that the links in `data_in` are valid and point to the correct detail pages.
- In `data_out`, use placeholders (e.g., `{dp2_id}`) to represent the data extracted from the detail page. They are replaced by the actual values during extraction.
- If the detail page contains dynamically loaded content, you may need to adjust the `interval` field to give the page enough time to load all content.

Perform comprehensive **testing** of your configuration before deployment to ensure accuracy and prevent errors.

Properly configuring the `detail_step` is crucial for the success of the [next extraction phase](Jexter%20Configuration:Extracting%20Drug%20Information%20in%20'detail_step'.md), ensuring that the right drug links are extracted and their details are [accurately captured and stored](API%20Configuration%20Guide%20in%20DP2.md).
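
To make the placeholder mechanism easier to picture, the following Python sketch shows one way `{product_link}` and `{dp2_id}` could be filled in from a `data_for_test` record. It is not Jexter's or DP2's actual implementation: the helper name `resolve_placeholders`, the regular expression, and the decision to leave unknown placeholders untouched are all assumptions made for illustration.

```python
import re

# Hypothetical helper, NOT part of Jexter or DP2: it only illustrates
# how {name} placeholders in a step configuration might be resolved.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def resolve_placeholders(value, record):
    """Recursively replace {name} placeholders in strings with record values."""
    if isinstance(value, str):
        def substitute(match):
            key = match.group(1)
            # Assumption: unknown placeholders are left as-is so that
            # missing input fields are easy to spot during testing.
            return str(record[key]) if key in record else match.group(0)
        return PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {k: resolve_placeholders(v, record) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(v, record) for v in value]
    return value  # numbers, booleans, and None pass through unchanged


# The test record from the example configuration above.
record = {
    "dp2_id": 12345678,
    "product_name": "Example Medication Name",
    "product_link": "https://www.examplepharm.com/product-detail?id=12345",
}

# A trimmed copy of the example configuration, keeping only placeholder fields.
config = {
    "url": "{product_link}",
    "excluded_workers": "{excluded_workers}",
    "data_out": {"api": {"where": {"uniqueId": "{dp2_id}"}}},
}

resolved = resolve_placeholders(config, record)
print(resolved["url"])
print(resolved["data_out"]["api"]["where"]["uniqueId"])
```

In this sketch `{excluded_workers}` stays unresolved because the test record does not define it. Checking for leftover braces in the resolved configuration is one simple way to catch missing input fields before deployment.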