Article Extraction Schema

This model has been tailored based on customer feedback and usage. If you need a more specialized model and this one is too general for your use case, you can contact us through the support link below. If some fields are missing, you can also contact us to add them.

Contact us
Article Schema object
  • url string

    URL of the article

  • headline string

    Headline of the article

  • date_published string

    Published date of the article

  • date_published_raw string

    Raw published date of the article

  • date_modified string

    Modified date of the article

  • date_modified_raw string

    Raw modified date of the article

  • author string

    Author of the article

  • authors_list array

    List of authors

    Items object
    • author_name string

      Name of the author

  • language string

    Language of the article (ISO 639 code)

  • breadcrumbs array

    Breadcrumbs for navigation

    Items object
    • name string

      Name of the breadcrumb

    • link string

      Link of the breadcrumb

  • main_image string

    URL of the main image

  • images array

    List of image URLs extracted from the document

    Items object
    • image_url string

      URL of the image

  • guessed_topics array

    List of guessed topics

    Items string
  • sentiment string

    Sentiment of the article

  • sentiment_probability number

    Probability of the predicted sentiment, from 0 to 1

  • description string

    Description of the article

  • article_body string

    Body of the article as text, with markdown links reduced to their link text and with spacing and punctuation fixed

  • article_body_html string

    HTML body of the article

  • video_urls array

    List of video URLs

    Items object
    • video_url string

      URL of the video

  • audio_urls array

    List of audio URLs

    Items object
    • audio_url string

      URL of the audio

  • related_articles array

    List of related articles

    Items object
    • headline string

      Headline of the related article

    • description string

      Description of the related article

    • url string

      URL of the related article

  • canonical_url string

    Canonical URL of the article

  • corpus array

    Structured content of the article

    Items object
    • type string

      Type of the content segment

    • content string

      Content of the segment
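
To make the nesting of the array fields concrete, here is a minimal sketch of how a record following this schema might be consumed. The sample values are invented for illustration, and several details are assumptions not stated above: that date_published is an ISO 8601 string, that sentiment takes a value such as "positive", and that corpus type values look like "h2" or "p". The helper names parse_date and summarize are hypothetical and not part of any client library.

```python
from datetime import datetime
from typing import Any, Dict, Optional

# Hypothetical sample record. Field names follow the schema above;
# every value is made up for illustration.
sample_article: Dict[str, Any] = {
    "url": "https://example.com/news/solar-farm",
    "headline": "Solar farm opens",
    "date_published": "2024-05-01T09:30:00",  # assumed ISO 8601
    "date_published_raw": "May 1, 2024 9:30 AM",
    "author": "Jane Doe",
    "authors_list": [{"author_name": "Jane Doe"}, {"author_name": "John Roe"}],
    "language": "en",
    "breadcrumbs": [
        {"name": "Home", "link": "https://example.com/"},
        {"name": "News", "link": "https://example.com/news"},
    ],
    "images": [{"image_url": "https://example.com/img/a.jpg"}],
    "guessed_topics": ["energy", "solar"],
    "sentiment": "positive",  # assumed label
    "sentiment_probability": 0.87,
    "corpus": [
        {"type": "h2", "content": "Overview"},  # assumed type values
        {"type": "p", "content": "The farm opened on Wednesday."},
    ],
}


def parse_date(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string; return None if missing or unparseable."""
    if not value:
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def summarize(article: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten the nested array fields into simple Python values.

    Every field is treated as optional, since a given page may not
    provide all of them.
    """
    authors = [
        a["author_name"]
        for a in article.get("authors_list") or []
        if a.get("author_name")
    ]
    image_urls = [
        i["image_url"] for i in article.get("images") or [] if i.get("image_url")
    ]
    breadcrumb_path = " > ".join(
        b["name"] for b in article.get("breadcrumbs") or [] if b.get("name")
    )
    text = "\n\n".join(
        seg["content"] for seg in article.get("corpus") or [] if seg.get("content")
    )
    return {
        "headline": article.get("headline"),
        "published": parse_date(article.get("date_published")),
        "authors": authors,
        "image_urls": image_urls,
        "breadcrumb_path": breadcrumb_path,
        "topics": list(article.get("guessed_topics") or []),
        "text": text,
    }


summary = summarize(sample_article)
```

Note that authors_list, images, video_urls and audio_urls hold objects with a single named key, while guessed_topics holds plain strings, so the two kinds of array need different handling. Falling back to date_published_raw when parse_date returns None is one reasonable option if the normalized date is absent.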