Methods and systems for detecting tables within documents are provided. The methods and systems may include receiving a text of the document that includes a plurality of words depicted in a document image. A feature set may be calculated for each word, containing one or more features of the corresponding word of the text. Candidate table words may then be identified based on the feature sets and used to identify a table location within the document image. In some cases, the candidate table words may be identified using a machine learning model.
G06V 30/412 - Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 30/414 - Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
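The table-detection flow in the abstract above (per-word feature sets, candidate table words, table location) can be sketched as follows. This is a minimal illustration, not the patented method: the `Word` type, the two features (`digit_ratio`, `gap_left`), and the threshold rule in `is_candidate` are all hypothetical, and the rule merely stands in for the machine learning model that the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def word_features(word, line_words):
    """Return a small, hypothetical feature set for one word."""
    digits = sum(ch.isdigit() for ch in word.text)
    digit_ratio = digits / max(len(word.text), 1)
    # Horizontal gap to the nearest word on the left in the same line;
    # the gap is 0 when the word has no left neighbour.
    left_edges = [w.x1 for w in line_words if w.x1 <= word.x0]
    gap_left = word.x0 - max(left_edges, default=word.x0)
    return {"digit_ratio": digit_ratio, "gap_left": gap_left}


def is_candidate(features, min_digit_ratio=0.5, min_gap=20.0):
    """Threshold rule standing in for a trained model (assumption)."""
    return (features["digit_ratio"] >= min_digit_ratio
            or features["gap_left"] >= min_gap)


def detect_table(lines):
    """Return the bounding box enclosing all candidate table words, or None."""
    candidates = [
        w
        for line_words in lines
        for w in line_words
        if is_candidate(word_features(w, line_words))
    ]
    if not candidates:
        return None
    return (
        min(w.x0 for w in candidates),
        min(w.y0 for w in candidates),
        max(w.x1 for w in candidates),
        max(w.y1 for w in candidates),
    )
```

Taking the union of the candidate boxes is the simplest possible way to turn candidate words into a location; a real system would more plausibly cluster the candidates first so that stray numeric words outside the table do not stretch the box.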
2.
Text line image splitting with different font sizes
A method for splitting text line images includes receiving a text line image and identifying that the text line image comprises a plurality of zones, wherein each zone includes text whose font differs from the text of adjacent zones. The method further includes selecting a splitting position between multiple zones and splitting the text line image at the splitting position into a plurality of image segments, wherein each image segment contains at least one zone of the text line image. Optical character recognition is then performed on each image segment to recognize a text segment of the image segment. In certain implementations, the method further includes generating one or more confidence measurements and selecting a splitting position that corresponds to a large gradient in the confidence measurement.
G06V 30/144 - Image acquisition using a slot moved over the image; Image acquisition using discrete sensing elements at predetermined points; Image acquisition using automatic curve following means
G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
G06V 30/414 - Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
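The gradient-based choice of splitting position described in the abstract above can be sketched as follows. The abstract does not say how the confidence measurements are produced, so this sketch simply assumes one confidence value per horizontal position of the line; `select_split_position` and `split_line` are hypothetical names.

```python
def select_split_position(confidences):
    """Return the index at which to split, chosen at the largest jump.

    `confidences` is assumed to hold one recognition-confidence value per
    horizontal position. The returned index i means the split falls between
    positions i - 1 and i.
    """
    if len(confidences) < 2:
        raise ValueError("need at least two confidence values")
    # Absolute difference between neighbouring confidence values.
    gradients = [
        abs(confidences[i] - confidences[i - 1])
        for i in range(1, len(confidences))
    ]
    best = max(range(len(gradients)), key=gradients.__getitem__)
    return best + 1


def split_line(columns, position):
    """Split a sequence of image columns into two segments at `position`."""
    return columns[:position], columns[position:]
```

Each resulting segment would then be passed to optical character recognition separately. Only a single split is selected here; a line with more than two zones would need the selection applied repeatedly or the top several gradients taken.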
Methods and systems for recognizing named entities within the text of a document are provided. The methods and systems may include receiving a document image and recognized text of the document image. A feature map of the document image may be created, a tagged map may be created, and locations of tags within the tagged map may be estimated using a machine learning model. Named entities within the recognized text may be recognized based on the one or more locations of the tags. In some embodiments, the machine learning model is a convolutional neural network. In further embodiments, creating the feature map may include determining, for a subset of the cells of the feature map, one or more features of the recognized text contained in a corresponding portion of the document image.
G06V 30/412 - Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
G06N 3/04 - Architecture, e.g. interconnection topology
G06V 30/416 - Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors
G06V 30/414 - Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
G06V 30/18 - Extraction of features or characteristics of the image
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
G06V 30/19 - Recognition using electronic means
G06V 20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
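The feature-map construction mentioned in the abstract above, in which cells of a grid carry features of the recognized text found in the corresponding portion of the document image, can be sketched as follows. The three per-cell features and the function name `build_feature_map` are assumptions made for illustration; the convolutional network that would consume the map and produce the tagged map is not shown.

```python
def build_feature_map(words, page_w, page_h, rows, cols):
    """Rasterise recognized words onto a rows x cols grid of feature cells.

    `words` is a list of (text, x0, y0, x1, y1) tuples in page coordinates.
    Each cell holds three hypothetical features:
    [has_text, contains_digit, starts_uppercase].
    """
    grid = [[[0.0, 0.0, 0.0] for _ in range(cols)] for _ in range(rows)]
    for text, x0, y0, x1, y1 in words:
        feats = [
            1.0,
            float(any(ch.isdigit() for ch in text)),
            float(text[:1].isupper()),
        ]
        # Map the word's bounding box to the range of cells it overlaps.
        c0 = min(int(x0 / page_w * cols), cols - 1)
        c1 = min(int(x1 / page_w * cols), cols - 1)
        r0 = min(int(y0 / page_h * rows), rows - 1)
        r1 = min(int(y1 / page_h * rows), rows - 1)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                # Keep the element-wise maximum when words share a cell.
                grid[r][c] = [max(a, b) for a, b in zip(grid[r][c], feats)]
    return grid
```

Cells that contain no recognized text keep their all-zero features, which matches the abstract's statement that features are determined only for a subset of the cells.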
A method for estimating text heights of text line images includes estimating a text height with a sequence recognizer. The method further includes normalizing a vertical dimension and/or position of text within a text line image based on the text height. The method may further include calculating a feature of the text line image. In some examples, the sequence recognizer estimates the text height with a machine learning model.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/24 - Aligning, centring, orientation detection or correction of the image
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
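The normalization step in the abstract above can be sketched as follows, assuming the text height and the top of the text band have already been estimated (the sequence recognizer itself is not shown). The function name `normalize_line`, the `target_height` parameter, and the use of nearest-neighbour resampling are illustrative choices, not details from the source.

```python
def normalize_line(image, text_top, text_height, target_height=32):
    """Crop a line image to its text band and rescale it to a fixed height.

    `image` is a list of pixel rows. `text_top` and `text_height` are the
    estimated vertical position and height of the text (assumed inputs).
    Nearest-neighbour resampling keeps the sketch dependency-free.
    """
    if text_height <= 0 or text_top < 0 or text_top + text_height > len(image):
        raise ValueError("text band lies outside the image")
    band = image[text_top:text_top + text_height]
    width = len(band[0])
    scale = target_height / text_height
    new_width = max(1, round(width * scale))
    return [
        [
            band[min(int(r / scale), text_height - 1)][min(int(c / scale), width - 1)]
            for c in range(new_width)
        ]
        for r in range(target_height)
    ]
```

Cropping to the band normalizes the vertical position of the text, and the uniform scale factor normalizes its vertical dimension while preserving the aspect ratio.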
5.
Post-filtering of named entities with machine learning
A method for identifying errors associated with named entity recognition includes recognizing a candidate named entity within a text and extracting a chunk from the text containing the candidate named entity. The method further includes creating a feature vector associated with the chunk and analyzing the feature vector for an indication of an error associated with the candidate named entity. The method also includes correcting the error associated with the candidate named entity.
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
G06V 30/19 - Recognition using electronic means
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
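The post-filtering steps in the abstract above (extract a chunk around the candidate, build a feature vector, check it for an error, correct the error) can be sketched as follows. All names and features here are hypothetical, the rule in `looks_like_error` stands in for a trained model, and "correcting" is simplified to discarding the flagged candidate.

```python
def extract_chunk(tokens, start, end, window=2):
    """Return the candidate tokens plus `window` tokens of context per side."""
    return tokens[max(0, start - window):min(len(tokens), end + window)]


def chunk_features(tokens, start, end, window=2):
    """Build a small, hypothetical feature vector for one candidate entity."""
    entity = tokens[start:end]
    chunk = extract_chunk(tokens, start, end, window)
    capitalised = sum(tok[:1].isupper() for tok in entity) / len(entity)
    has_digit = float(any(ch.isdigit() for tok in entity for ch in tok))
    return [float(len(entity)), capitalised, has_digit, float(len(chunk))]


def looks_like_error(features):
    """Rule standing in for a trained error classifier (assumption)."""
    _, capitalised, has_digit, _ = features
    return capitalised < 0.5 or has_digit == 1.0


def post_filter(tokens, candidates):
    """Drop candidates flagged as errors; each is a (start, end, label) tuple."""
    return [
        (start, end, label)
        for start, end, label in candidates
        if not looks_like_error(chunk_features(tokens, start, end))
    ]
```

A fuller correction step could relabel the candidate or adjust its boundaries instead of dropping it; the abstract leaves the form of the correction open.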
Systems and methods for transforming data between multiple styles are provided. In one embodiment, a system is provided that includes a generator model, a discriminator model, and a preserver model. The generator model may be configured to receive data in a first style and generate converted data in a second style. The discriminator model may be configured to receive the converted data from the generator model, compare the converted data to original data in the second style, and compute a resemblance measure based on the comparison. The preserver model may be configured to receive the converted data from the generator model and compute an information measure of the converted data. The generator model may also be trained to optimize the resemblance measure and the information measure.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
G06V 30/19 - Recognition using electronic means
G06V 30/413 - Classification of content, e.g. text, photographs or tables
Methods and systems for detecting tables within documents are provided. The methods and systems may include receiving a text of the document that includes a plurality of words depicted in a document image. A feature set may be calculated for each word, containing one or more features of the corresponding word of the text. Candidate table words may then be identified based on the feature sets and used to identify a table location within the document image. In some cases, the candidate table words may be identified using a machine learning model.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/36 - Image preprocessing, i.e. processing the image information without deciding about the identity of the image
G06K 9/46 - Extraction of features or characteristics of the image
Methods and systems for detecting tables within documents are provided. The methods and systems may include receiving a text of the document that includes a plurality of words depicted in a document image. A feature set may be calculated for each word, containing one or more features of the corresponding word of the text. Candidate table words may then be identified based on the feature sets and used to identify a table location within the document image. In some cases, the candidate table words may be identified using a machine learning model.
G06V 30/412 - Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 30/414 - Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
10.
NAMED ENTITY RECOGNITION WITH CONVOLUTIONAL NETWORKS
Methods and systems for recognizing named entities within the text of a document are provided. The methods and systems may include receiving a document image and recognized text of the document image. A feature map of the document image may be created, a tagged map may be created, and locations of tags within the tagged map may be estimated using a machine learning model. Named entities within the recognized text may be recognized based on the one or more locations of the tags. In some embodiments, the machine learning model is a convolutional neural network. In further embodiments, creating the feature map may include determining, for a subset of the cells of the feature map, one or more features of the recognized text contained in a corresponding portion of the document image.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06N 3/04 - Architecture, e.g. interconnection topology
A method for splitting text line images includes receiving a text line image and identifying that the text line image comprises a plurality of zones, wherein each zone includes text whose font differs from the text of adjacent zones. The method further includes selecting a splitting position between multiple zones and splitting the text line image at the splitting position into a plurality of image segments, wherein each image segment contains at least one zone of the text line image. Optical character recognition is then performed on each image segment to recognize a text segment of the image segment. In certain implementations, the method further includes generating one or more confidence measurements and selecting a splitting position that corresponds to a large gradient in the confidence measurement.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/34 - Segmentation of touching or overlapping patterns in the image field
G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
13.
TEXT LINE IMAGE SPLITTING WITH DIFFERENT FONT SIZES
A method for splitting text line images includes receiving a text line image and identifying that the text line image comprises a plurality of zones, wherein each zone includes text whose font differs from the text of adjacent zones. The method further includes selecting a splitting position between multiple zones and splitting the text line image at the splitting position into a plurality of image segments, wherein each image segment contains at least one zone of the text line image. Optical character recognition is then performed on each image segment to recognize a text segment of the image segment. In certain implementations, the method further includes generating one or more confidence measurements and selecting a splitting position that corresponds to a large gradient in the confidence measurement.
G06K 9/18 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints, using printed characters having additional code marks or containing code marks, e.g. the character being composed of individual strokes of different shape, each representing a different code value
G06K 9/36 - Image preprocessing, i.e. processing the image information without deciding about the identity of the image
G06K 9/58 - Image preprocessing, i.e. processing the image information without deciding about the identity of the image, using optical means
A method for estimating text heights of text line images includes estimating a text height with a sequence recognizer. The method further includes normalizing a vertical dimension and/or position of text within a text line image based on the text height. The method may further include calculating a feature of the text line image. In some examples, the sequence recognizer estimates the text height with a machine learning model.
G06K 9/34 - Segmentation of touching or overlapping patterns in the image field
G06K 9/68 - Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references, e.g. addressable memory
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/42 - Normalisation of the pattern dimensions
A method for estimating text heights of text line images includes estimating a text height with a sequence recognizer. The method further includes normalizing a vertical dimension and/or position of text within a text line image based on the text height. The method may further include calculating a feature of the text line image. In some examples, the sequence recognizer estimates the text height with a machine learning model.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/32 - Aligning or centering of the image pick-up or image-field
16.
POST-FILTERING OF NAMED ENTITIES WITH MACHINE LEARNING
A method for identifying errors associated with named entity recognition includes recognizing a candidate named entity within a text and extracting a chunk from the text containing the candidate named entity. The method further includes creating a feature vector associated with the chunk and analyzing the feature vector for an indication of an error associated with the candidate named entity. The method also includes correcting the error associated with the candidate named entity.