Article Extraction Schema
This model has been tailored based on customer feedback and usage. If you need a more specific, less generalist model, you can contact us through the support link below. If some fields are missing, you can also contact us to add them.
Contact us
Article Schema object
-
url
string
URL of the article
-
headline
string
Headline of the article
-
date_published
string
Published date of the article
-
date_published_raw
string
Raw published date of the article
-
date_modified
string
Modified date of the article
-
date_modified_raw
string
Raw modified date of the article
-
author
string
Author of the article
-
authors_list
array
List of authors
Items object
-
author_name
string
Name of the author
-
language
string
Language of the article (ISO 639 code)
-
breadcrumbs
array
Breadcrumbs for navigation
Items object
-
name
string
Name of the breadcrumb
-
link
string
Link of the breadcrumb
-
main_image
string
URL of the main image
-
images
array
List of image URLs extracted from the document
Items object
-
image_url
string
URL of the image
-
guessed_topics
array
List of guessed topics
Items string
-
sentiment
string
Sentiment of the article
-
sentiment_probability
number
Probability of the sentiment from 0 to 1
-
description
string
Description of the article
-
article_body
string
Body of the article as text, with links reduced to their text only and with spacing and punctuation fixed
-
article_body_html
string
HTML body of the article
-
video_urls
array
List of video URLs
Items object
-
video_url
string
URL of the video
-
audio_urls
array
List of audio URLs
Items object
-
audio_url
string
URL of the audio
-
related_articles
array
List of related articles
Items object
-
headline
string
Headline of the related article
-
description
string
Description of the related article
-
url
string
URL of the related article
-
canonical_url
string
Canonical URL of the article
-
corpus
array
Structured content of the article
Items object
-
type
string
Type of the content segment
-
content
string
Content of the segment
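Example: reading an Article object in Python

The schema above can be mirrored in client code with typed dictionaries. The sketch below is illustrative only and covers a subset of the fields. It assumes that any field may be absent or null in a response, which the schema does not state. The helper names (author_names, breadcrumb_trail, is_confident), the 0.8 threshold, and every value in the sample are hypothetical; in particular, the set of possible sentiment values is not documented here.

```python
from typing import List, TypedDict


class Author(TypedDict, total=False):
    author_name: str


class Breadcrumb(TypedDict, total=False):
    name: str
    link: str


class CorpusSegment(TypedDict, total=False):
    type: str
    content: str


class Article(TypedDict, total=False):
    # Subset of the documented fields; total=False marks them all optional
    # (an assumption, the schema does not say which fields are required).
    url: str
    headline: str
    date_published: str
    author: str
    authors_list: List[Author]
    language: str
    breadcrumbs: List[Breadcrumb]
    guessed_topics: List[str]
    sentiment: str
    sentiment_probability: float
    article_body: str
    corpus: List[CorpusSegment]


def author_names(article: Article) -> List[str]:
    """Names from authors_list, falling back to the single author string."""
    entries = article.get("authors_list") or []
    names = [a["author_name"] for a in entries if a.get("author_name")]
    if not names and article.get("author"):
        names = [article["author"]]
    return names


def breadcrumb_trail(article: Article, sep: str = " > ") -> str:
    """Join breadcrumb names in order, skipping entries without a name."""
    crumbs = article.get("breadcrumbs") or []
    return sep.join(b["name"] for b in crumbs if b.get("name"))


def is_confident(article: Article, threshold: float = 0.8) -> bool:
    """True when sentiment_probability (0 to 1) reaches the threshold."""
    return (article.get("sentiment_probability") or 0.0) >= threshold


# Hypothetical sample; none of these values come from a real response.
sample: Article = {
    "url": "https://example.com/news/1",
    "headline": "Example headline",
    "author": "Jane Doe",
    "authors_list": [{"author_name": "Jane Doe"}],
    "breadcrumbs": [
        {"name": "Home", "link": "https://example.com/"},
        {"name": "Tech", "link": "https://example.com/tech"},
    ],
    "sentiment": "positive",
    "sentiment_probability": 0.93,
}

print(author_names(sample))
print(breadcrumb_trail(sample))
print(is_confident(sample))
```

Using .get with a fallback keeps the helpers tolerant of missing or null fields, which is useful if extraction does not populate every field for a given page.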