How It Works


Data2Content, a semantic-based technology at the heart of innovation

Data2Content makes it possible to create large volumes of human-quality content from structured databases containing data such as products and their specifications or a topic and associated keywords.

Data2Content does not rewrite text. We start from a database that provides information (e.g., about a product) and we write content (e.g., product descriptions) using this information.

[Image: generation of descriptive texts from data related to shoe products]

Turn your database into content

When a database is available, a description is generated for each product, using the related information that it contains. The above image illustrates the generation of descriptive texts from data related to shoe products.
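To make the idea concrete, here is a minimal sketch of generating one description per database record. It is an illustration only, not Data2Content's actual engine: the records, the field names (`name`, `category`, `color`, `sole`, `weight_g`) and the `describe` function are all hypothetical.

```python
# Hypothetical product records; the field names are illustrative assumptions.
PRODUCTS = [
    {"name": "Trail Runner 2", "category": "running shoe", "color": "blue",
     "sole": "rubber", "weight_g": 280},
    {"name": "City Walker", "category": "sneaker", "color": "orange",
     "sole": "foam", "weight_g": 310},
]


def describe(product: dict) -> str:
    """Write a short description using only the data in one record."""
    # Pick "a" or "an" from the word that follows the article (the color).
    article = "an" if product["color"][0].lower() in "aeiou" else "a"
    sentences = [
        f"The {product['name']} is {article} {product['color']} {product['category']}.",
        f"It has a {product['sole']} sole and weighs {product['weight_g']} g.",
    ]
    return " ".join(sentences)


# One description per record in the database.
descriptions = [describe(p) for p in PRODUCTS]
for text in descriptions:
    print(text)
```

Nothing is rewritten from an existing text here: every sentence is built from the fields of the record, which is the distinction drawn above.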

Data2Content makes it possible not only to turn your database into meaningful content, but also to provide relevant, high-quality information that helps your customers better assess the strengths of your products. Instead of simply describing every feature of a product, you can select which features to showcase and accompany them with expert explanations.

Moreover, if necessary, Syllabs technologies can be used to enrich the databases used for content generation, either with data collected from the Web or via the extraction of information available in free text.

The semantic intelligence of Data2Content vs. content spinning

Content spinning rewrites the same article over and over by replacing specific words or phrases. The resulting text is of very poor quality because pseudo-synonyms are used to replace certain words, leading to approximations or even nonsense.
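The effect is easy to reproduce. The sketch below is a deliberately naive spinner written for illustration; the `PSEUDO_SYNONYMS` table and the `spin` function are hypothetical and do not come from any real spinning tool. It swaps words without looking at their context, which is where the nonsense comes from.

```python
import re

# Hypothetical pseudo-synonym table of the kind a naive spinner relies on.
PSEUDO_SYNONYMS = {
    "light": ["bright", "pale"],
    "running": ["operating", "flowing"],
    "sole": ["only", "single"],
}


def spin(text: str, variant: int) -> str:
    """Replace every known word with one of its pseudo-synonyms, ignoring context."""
    def swap(match):
        word = match.group(0)
        options = PSEUDO_SYNONYMS.get(word.lower())
        # Unknown words are left untouched.
        return options[variant % len(options)] if options else word

    return re.sub(r"[A-Za-z]+", swap, text)


original = "A light running shoe with a rubber sole."
spun = [spin(original, v) for v in range(2)]
for text in spun:
    print(text)
```

Each output is technically a "new" text, yet the sense of "light", "running" and "sole" in the context of a shoe is lost in both.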

To create content from large volumes of data, the following options are available:

  • rely on web editors
  • rely on marketplaces or offshore editors
  • use basic text-variation techniques, known as ‘content spinning’

The following table summarizes the strengths and weaknesses of each approach.

Criteria                Web Editors   Offshore Editors   Content Spinning
Text Quality            +++           +                  ---
Production Speed        --            -                  +++
Unique Content (SEO!)   --            -                  ++
Cost                    ---           +                  +++
Multilingualism         --            -                  -

These solutions should be compared according to 5 criteria:

Text Quality

  • Traditional content writers (journalists, editors) unquestionably deliver the best text quality. Their limitation is volume: even a skilled writer will find it hard to produce thousands of texts on very similar topics while keeping them highly varied.
  • Marketplaces and offshore editors, when used as low-cost solutions, usually produce text of mediocre quality. For higher-quality text, the cost comes close to that of traditional content writers.
  • Content spinning produces results of very poor quality, with sentences that often make little sense, if any.
  • Syllabs offers human-quality content, and this quality is consistent from one text to the next.

Production Speed

  • For content writers (whether traditional or not), producing texts takes a lot of time: at least 15 minutes per text for fairly low-quality results, which already amounts to over 60 days of work for 2,000 texts.
  • Automated methods, on the other hand, produce almost instant results once the system has been configured. Setting up Data2Content can take from a few days to a few weeks for highly complex projects that require collecting additional information (web mining, text mining). Once the configuration is complete, it takes only a few minutes to generate thousands of texts from the available data.
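The 60-day figure for human writers can be checked with a short calculation. The 15 minutes per text and the 2,000 texts come from the comparison above; the 8-hour working day is an assumption, since the length of a working day is not stated.

```python
texts = 2000
minutes_per_text = 15
hours_per_day = 8  # assumption: length of a working day is not given above

total_hours = texts * minutes_per_text / 60
working_days = total_hours / hours_per_day

print(f"{total_hours:.0f} hours, or {working_days:.1f} working days")
```

Under that assumption the total is 500 hours, or 62.5 working days for a single writer, which matches "over 60 days".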

Unique Content (SEO)

  • Producing a large number of texts of the same nature (descriptions of similar products) with a high level of writing variability is always challenging for a human content editor. Similar expressions recur across the texts, which can cause significant issues for content uniqueness, a principle highly valued by search engines such as Google.
  • All that content spinning does is modify some elements within a sentence or simply change their order, which seriously affects text quality. Its level of variability is also very low compared to what Data2Content offers.
  • The technology used by Data2Content to create texts lets our linguists introduce variation contexts anywhere in the text, producing unique sentences in which similar segments are rarely reused.
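As a rough illustration of why variation at several places in a sentence multiplies the number of distinct outputs, the sketch below treats each variation context as a slot with interchangeable phrasings. This is a simplifying assumption, not a description of how Data2Content works internally; the slot contents, the record and the `variants` function are hypothetical.

```python
from itertools import product

# Each list is one hypothetical variation point with interchangeable phrasings.
OPENERS = ["Built for daily use,", "Made for everyday wear,", "Designed for long days,"]
SUBJECTS = ["this {category}", "the {name}"]
CLAIMS = ["comes in {color}.", "is available in {color}."]

RECORD = {"name": "City Walker", "category": "sneaker", "color": "white"}


def variants(record: dict) -> list:
    """Combine every choice at every variation point, then fill in the data."""
    sentences = []
    for opener, subject, claim in product(OPENERS, SUBJECTS, CLAIMS):
        pattern = " ".join((opener, subject, claim))
        sentences.append(pattern.format(**record))
    return sentences


all_variants = variants(RECORD)
print(len(all_variants), "distinct sentences, for example:")
print(all_variants[0])
```

Three small slots already give 3 × 2 × 2 = 12 distinct sentences for a single record, and each added variation point multiplies that number again.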

Cost

  • When very large amounts of text are involved, traditional content editors are usually too expensive for most companies. This often applies to offshore or marketplace content writers as well, especially if text quality is not to be compromised.
  • The cost of content spinning is very low, but at the expense of quality and with a high risk of being blacklisted by search engines like Google.
  • Data2Content offers very attractive prices, especially when a large enough number of texts needs to be generated. For instance, for a few thousand descriptions of same-type products, Data2Content offers unrivaled value for money.

Multilingualism

  • Finding content editors in several languages is always difficult. Moreover, since thousands of texts need to be produced, a large number of editors must be found for each language.
  • Content spinning cannot turn existing text into another language. Some use spam techniques that combine machine translation with further content spinning, in either order. The quality of the resulting text is close to zero.
  • Data2Content is currently available in 3 languages (English, Spanish and French). We can develop modules for additional languages in very little time. Note that we do not use machine translation to produce texts in other languages, because of the very mediocre quality of texts produced by such services. Our multilingual content consists of new texts generated directly in the target language, using the same database as for the initial language.
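The difference between translating a finished text and generating directly in each language can be sketched as follows. This is a simplified illustration under stated assumptions: the record, the per-language `TEMPLATES` and `COLOR_WORDS` tables, and the `generate` function are hypothetical and stand in for language modules that would be far richer in practice.

```python
# One language-neutral record, shared by every language (hypothetical data).
RECORD = {"name": "City Walker", "color": "white", "weight_g": 310}

# Hypothetical per-language resources: a sentence pattern and a small lexicon.
TEMPLATES = {
    "en": "The {name} comes in {color} and weighs {weight_g} g.",
    "fr": "Le modèle {name} est disponible en {color} et pèse {weight_g} g.",
    "es": "El modelo {name} está disponible en {color} y pesa {weight_g} g.",
}
COLOR_WORDS = {
    "en": {"white": "white"},
    "fr": {"white": "blanc"},
    "es": {"white": "blanco"},
}


def generate(record: dict, lang: str) -> str:
    """Generate a sentence in `lang` straight from the data, with no translation step."""
    localized = dict(record, color=COLOR_WORDS[lang][record["color"]])
    return TEMPLATES[lang].format(**localized)


texts = {lang: generate(RECORD, lang) for lang in TEMPLATES}
for lang, text in texts.items():
    print(lang, text)
```

Each sentence is produced from the same record by resources written for that language, so no output depends on the wording of another language's text.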

Data2Content delivers content of much higher quality than content spinning, with a much lower risk of being blacklisted by search engines. It is much faster than human editors, offers comparable or even higher quality for large quantities of text, and does so at a much lower cost.