Sentence splitting

Splitting a text into a sequence of sentences


Sentence splitting is the process of converting a text into a sequence of sentences.

Sentence splitting is an essential step in building parallel data for machine translation, because such data must be aligned at the sentence level. Sentence-level alignment is what makes training efficient.

Common approaches to sentence splitting include rule-based methods, such as regular expressions, and machine learning models.
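
As an illustration of the rule-based approach, here is a minimal sketch in Python. It treats whitespace after ".", "!" or "?" as a candidate boundary and rejects the boundary if the preceding token is a known abbreviation. The function name split_sentences and the abbreviation list are assumptions made for this example, not part of any particular library, and the list is far from complete.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (an assumption for this sketch).
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "prof.", "e.g.", "i.e.", "etc."}

# Candidate boundary: a run of whitespace directly after ".", "!" or "?".
BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences using a regex rule plus an abbreviation check."""
    sentences = []
    current = []  # pieces collected for the sentence being built
    for piece in BOUNDARY.split(text.strip()):
        if not piece:
            continue
        current.append(piece)
        last_token = piece.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            # The period belongs to an abbreviation, so keep collecting.
            continue
        # Pieces are rejoined with a single space, which normalizes whitespace.
        sentences.append(" ".join(current))
        current = []
    if current:
        # Text ended right after an abbreviation: keep what was collected.
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Dr. Smith arrived. He was late! Why?"))
```

Rules like these are easy to inspect but brittle: every language and domain needs its own abbreviation list, and cases such as decimal numbers or quoted speech need further rules. That brittleness is one reason machine learning models are used as an alternative.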

Sentence splitting is also called sentence segmentation or sentence boundary detection.


Machine Translate is created and edited by contributors like you!

Licensed under CC-BY-SA-4.0.
