Different types of media experiences can be developed based on characteristics of the consumer. “Linear” experiences may require execution of a pre-built script, although the script could be dynamically modified by a media production platform. Linear experiences can include guided audio tours that are modified or updated based on the location of the consumer. “Enhanced” experiences include conventional media content that is supplemented with intelligent media content. For example, turn-by-turn directions could be supplemented with audio descriptions about the surrounding area. “Freeform” experiences, meanwhile, are those that can continually morph based on information gleaned from a consumer. For example, a radio station may modify what content is being presented based on the geographical metadata uploaded by a computing device associated with the consumer.
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range of values
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance, using icons
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G10L 15/26 - Speech-to-text systems
2.
UPSAMPLING OF AUDIO USING GENERATIVE ADVERSARIAL NETWORKS
Introduced here are approaches to training and then employing computer-implemented models designed to upsample discrete audio signals to higher sampling rates. Assume, for example, that a media production platform obtains a first discrete audio signal at a relatively low sampling rate. The relatively low sampling rate may make the first discrete audio signal unsuitable for inclusion in media compilations, so the media production platform may attempt to improve its quality through upsampling. To accomplish this, the media production platform can apply a transform to the first discrete audio signal to produce a first magnitude spectrogram. Then, the media production platform can apply a computer-implemented model to the first magnitude spectrogram to produce a second magnitude spectrogram. Thereafter, the media production platform can apply an inverse transform to the second magnitude spectrogram to create a second discrete audio signal that has a higher sampling rate than the first discrete audio signal.
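The transform → model → inverse-transform pipeline described above can be sketched as follows. This is a minimal NumPy illustration in which the trained generator is replaced by a zero-fill placeholder for the new high-frequency bins; the function name, frame size, and the zero-fill "model" are all hypothetical, not the patented implementation.

```python
import numpy as np

def upsample_via_spectrogram(signal, factor=2, frame=256):
    """Sketch of the abstract's pipeline: transform -> model -> inverse transform.
    The learned model is stood in for by zero-filling new high-frequency bins."""
    n_frames = len(signal) // frame
    frames = signal[:n_frames * frame].reshape(n_frames, frame)

    spectrum = np.fft.rfft(frames, axis=1)            # transform
    mag, phase = np.abs(spectrum), np.angle(spectrum)

    # Placeholder "model": widen the magnitude spectrogram; a trained
    # generator would predict the missing high-frequency content instead.
    new_bins = factor * frame // 2 + 1
    mag_hi = np.zeros((n_frames, new_bins))
    mag_hi[:, :mag.shape[1]] = mag
    phase_hi = np.zeros((n_frames, new_bins))
    phase_hi[:, :phase.shape[1]] = phase

    # Inverse transform at the larger frame size yields factor-times as
    # many samples for the same duration, i.e. a higher sampling rate.
    out = np.fft.irfft(mag_hi * np.exp(1j * phase_hi), n=factor * frame, axis=1)
    return (out * factor).ravel()                     # rescale for the longer IFFT
```

For a band-limited input this placeholder reduces to sinc interpolation, which is why the sketch reproduces the original samples at the original sample positions.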
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
G06N 3/088 - Non-supervised learning, e.g. competitive learning
G10L 19/02 - Speech or audio signal analysis or synthesis techniques for redundancy reduction, e.g. in vocoders; coding or decoding of speech or audio signals using source-filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the analysis technique, using neural networks
3.
TRAINING GENERATIVE ADVERSARIAL NETWORKS TO UPSAMPLE AUDIO
Introduced here are approaches to training and then employing computer-implemented models designed to upsample discrete audio signals to higher sampling rates. Assume, for example, that a media production platform obtains a first discrete audio signal at a relatively low sampling rate. The relatively low sampling rate may make the first discrete audio signal unsuitable for inclusion in media compilations, so the media production platform may attempt to improve its quality through upsampling. To accomplish this, the media production platform can apply a transform to the first discrete audio signal to produce a first magnitude spectrogram. Then, the media production platform can apply a computer-implemented model to the first magnitude spectrogram to produce a second magnitude spectrogram. Thereafter, the media production platform can apply an inverse transform to the second magnitude spectrogram to create a second discrete audio signal that has a higher sampling rate than the first discrete audio signal.
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
G06N 3/088 - Non-supervised learning, e.g. competitive learning
G10L 19/02 - Speech or audio signal analysis or synthesis techniques for redundancy reduction, e.g. in vocoders; coding or decoding of speech or audio signals using source-filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the analysis technique, using neural networks
4.
FILLER WORD DETECTION THROUGH TOKENIZING AND LABELING OF TRANSCRIPTS
Introduced here are computer programs and associated computer-implemented techniques for discovering the presence of filler words through tokenization of a transcript derived from audio content. When audio content is obtained by a media production platform, the audio content can be converted into text content as part of a speech-to-text operation. The text content can then be tokenized and labeled using a Natural Language Processing (NLP) library. Tokenizing/labeling may be performed in accordance with a series of rules associated with filler words. At a high level, these rules may examine the text content (and associated tokens/labels) to determine whether patterns, relationships, verbatim matches, and context indicate that a term is a filler word. Any filler words that are discovered in the text content can be identified as such so that appropriate action(s) can be taken.
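The tokenize-and-label step can be illustrated with a small sketch. The rule set below (a verbatim list plus one comma-context rule) is a hypothetical stand-in for an NLP library's tagger, not the patented rule series:

```python
import re

FILLER_VERBATIM = {"um", "uh", "er", "hmm", "mhm"}
# Context-dependent candidates: fillers only in certain patterns.
CONTEXT_FILLERS = {"like", "so", "well"}

def tag_filler_words(transcript):
    """Tokenize a transcript and label each token as filler or not."""
    tokens = re.findall(r"[\w']+|[.,!?;]", transcript.lower())
    labels = []
    for i, tok in enumerate(tokens):
        if tok in FILLER_VERBATIM:
            # Verbatim rule: these tokens are always fillers.
            labels.append((tok, True))
        elif tok in CONTEXT_FILLERS:
            # Context rule: a candidate set off by commas on both
            # sides (e.g. ", like,") is treated as a filler.
            prev_comma = i > 0 and tokens[i - 1] == ","
            next_comma = i + 1 < len(tokens) and tokens[i + 1] == ","
            labels.append((tok, prev_comma and next_comma))
        else:
            labels.append((tok, False))
    return labels
```

A production system would draw its tokens, part-of-speech labels, and dependency relationships from the NLP library rather than a regular expression.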
Introduced here are computer programs and associated computer-implemented techniques for facilitating the creation of a master transcription (or simply “transcript”) that more accurately reflects underlying audio by comparing multiple independently generated transcripts. The master transcript may be used to record and/or produce various forms of media content, as further discussed below. Thus, the technology described herein may be used to facilitate editing of text content, audio content, or video content. These computer programs may be supported by a media production platform that is able to generate the interfaces through which individuals (also referred to as “users”) can create, edit, or view media content. For example, a computer program may be embodied as a word processor that allows individuals to edit voice-based audio content by editing a master transcript, and vice versa.
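The comparison of independently generated transcripts can be sketched as a position-wise majority vote. This assumes the transcripts are already word-aligned; a real system would first align them, for example with edit distance. All names here are illustrative:

```python
from collections import Counter

def merge_transcripts(transcripts):
    """Build a master transcript by majority vote across independently
    generated transcripts. Assumes the inputs are already word-aligned."""
    word_lists = [t.split() for t in transcripts]
    assert len({len(w) for w in word_lists}) == 1, "expects aligned transcripts"
    master = []
    for column in zip(*word_lists):
        # Keep the most common candidate at each word position.
        word, _ = Counter(column).most_common(1)[0]
        master.append(word)
    return " ".join(master)
```

The intuition is that independent speech-to-text engines tend to err in different places, so the per-position majority is more accurate than any single transcript.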
Introduced here are computer programs and associated computer-implemented techniques for manipulating noisy audio signals to produce clean audio signals that are sufficiently high quality so as to be largely, if not entirely, indistinguishable from “rich” recordings generated by recording studios. When a noisy audio signal is obtained by a media production platform, the noisy audio signal can be manipulated to sound as if recording occurred with sophisticated equipment in a soundproof environment. Manipulation can be performed by a model that, when applied to the noisy audio signal, can manipulate its characteristics so as to emulate the characteristics of clean audio signals that are learned through training.
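As a rough illustration of the idea, classical spectral gating can stand in for the learned model: frequency bins whose magnitude falls below a noise threshold are suppressed. A trained model would instead map the characteristics of noisy audio to the learned characteristics of clean audio; the function and threshold below are assumptions for the sketch.

```python
import numpy as np

def spectral_gate(noisy, frame=256, threshold=0.1):
    """Suppress frequency bins below a fraction of the peak magnitude
    (simple spectral gating), as a stand-in for the learned clean-up model."""
    n = len(noisy) // frame
    frames = noisy[:n * frame].reshape(n, frame)
    spec = np.fft.rfft(frames, axis=1)
    mag = np.abs(spec)
    # Keep only bins that rise above the gate; zero the rest.
    gate = mag >= threshold * mag.max()
    return np.fft.irfft(spec * gate, n=frame, axis=1).ravel()
```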
G10L 21/0232 - Processing in the frequency domain
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the analysis technique, using neural networks
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
G10L 15/06 - Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 25/21 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being power information
10.
APPROACHES TO GENERATING STUDIO-QUALITY RECORDINGS THROUGH MANIPULATION OF NOISY AUDIO
Introduced here are computer programs and associated computer-implemented techniques for manipulating noisy audio signals to produce clean audio signals that are sufficiently high quality so as to be largely, if not entirely, indistinguishable from “rich” recordings generated by recording studios. When a noisy audio signal is obtained by a media production platform, the noisy audio signal can be manipulated to sound as if recording occurred with sophisticated equipment in a soundproof environment. Manipulation can be performed by a model that, when applied to the noisy audio signal, can manipulate its characteristics so as to emulate the characteristics of clean audio signals that are learned through training.
G10L 21/0232 - Processing in the frequency domain
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the analysis technique, using neural networks
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
G10L 25/21 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being power information
Different types of media experiences can be developed based on characteristics of the consumer. “Linear” experiences may require execution of a pre-built script, although the script could be dynamically modified by a media production platform. Linear experiences can include guided audio tours that are modified or updated based on the location of the consumer. “Enhanced” experiences include conventional media content that is supplemented with intelligent media content. For example, turn-by-turn directions could be supplemented with audio descriptions about the surrounding area. “Freeform” experiences, meanwhile, are those that can continually morph based on information gleaned from a consumer. For example, a radio station may modify what content is being presented based on the geographical metadata uploaded by a computing device associated with the consumer.
G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance, using icons
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range of values
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G10L 15/26 - Speech-to-text systems
12.
SIMULTANEOUS RECORDING AND UPLOADING OF MULTIPLE AUDIO FILES OF THE SAME CONVERSATION AND AUDIO DRIFT NORMALIZATION SYSTEMS AND METHODS
The invention relates to simultaneous recording and uploading systems and methods, and, more particularly, to the simultaneous recording and uploading of multiple files from the same conversation.
Media content can be created and/or modified using a network-accessible platform. Scripts for content-based experiences could be readily created using one or more interfaces generated by the network-accessible platform. For example, a script for a content-based experience could be created using an interface that permits triggers to be inserted directly into the script. Interface(s) may also allow different media formats to be easily aligned for post-processing. For example, a transcript and an audio file may be dynamically aligned so that the network-accessible platform can globally reflect changes made to either item. User feedback may also be presented directly on the interface(s) so that modifications can be made based on actual user experiences.
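The dynamic transcript/audio alignment described above can be illustrated with a hypothetical data model in which each word carries start and end times; deleting words from the transcript then yields the audio spans to remove so the two formats stay in sync. The function and tuple layout are assumptions for the sketch:

```python
def audio_edits_for_transcript_deletion(alignment, deleted_indices):
    """alignment: list of (word, start_sec, end_sec) tuples, one per word.
    Returns merged audio spans to cut so the audio tracks the edited
    transcript (illustrative data model, not the platform's actual one)."""
    spans = sorted(alignment[i][1:] for i in sorted(deleted_indices))
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Adjacent or overlapping deletions collapse into one cut.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```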
G06F 16/61 - Indexing; data structures therefor; storage structures
G06Q 10/101 - Collaborative creation, e.g. joint development of products or services
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity coding or matrixing
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G11B 27/10 - Indexing; addressing; timing or synchronising; measuring tape travel
The invention relates to audio drift normalization, and more particularly to audio drift normalization systems and methods that can normalize audio drift of a plurality of recordings from a source.
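Given an estimated drift ratio, normalization amounts to resampling the drifted recording onto the reference clock. A minimal linear-interpolation sketch follows; in practice the drift ratio would be estimated from a shared reference (e.g. a sync tone), and the names here are illustrative:

```python
import numpy as np

def normalize_drift(samples, drift_ratio):
    """Resample a drifted recording back onto the reference clock.
    drift_ratio is the recorder's actual sample rate divided by the
    nominal rate (e.g. 1.0002 for a clock running 200 ppm fast)."""
    n_out = int(round(len(samples) / drift_ratio))
    # Positions in the drifted recording that fall on reference ticks.
    src_positions = np.arange(n_out) * drift_ratio
    return np.interp(src_positions, np.arange(len(samples)), samples)
```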
Different types of media experiences can be developed based on characteristics of the consumer. “Linear” experiences may require execution of a pre-built script, although the script could be dynamically modified by a media production platform. Linear experiences can include guided audio tours that are modified or updated based on the location of the consumer. “Enhanced” experiences include conventional media content that is supplemented with intelligent media content. For example, turn-by-turn directions could be supplemented with audio descriptions about the surrounding area. “Freeform” experiences, meanwhile, are those that can continually morph based on information gleaned from a consumer. For example, a radio station may modify what content is being presented based on the geographical metadata uploaded by a computing device associated with the consumer.
G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range of values
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance, using icons
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G10L 15/26 - Speech-to-text systems
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
16.
Upsampling of audio using generative adversarial networks
Introduced here are approaches to training and then employing computer-implemented models designed to upsample discrete audio signals to higher sampling rates. Assume, for example, that a media production platform obtains a first discrete audio signal at a relatively low sampling rate. The relatively low sampling rate may make the first discrete audio signal unsuitable for inclusion in media compilations, so the media production platform may attempt to improve its quality through upsampling. To accomplish this, the media production platform can apply a transform to the first discrete audio signal to produce a first magnitude spectrogram. Then, the media production platform can apply a computer-implemented model to the first magnitude spectrogram to produce a second magnitude spectrogram. Thereafter, the media production platform can apply an inverse transform to the second magnitude spectrogram to create a second discrete audio signal that has a higher sampling rate than the first discrete audio signal.
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
G06N 3/088 - Non-supervised learning, e.g. competitive learning
G10L 19/02 - Speech or audio signal analysis or synthesis techniques for redundancy reduction, e.g. in vocoders; coding or decoding of speech or audio signals using source-filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the analysis technique, using neural networks
17.
Training generative adversarial networks to upsample audio
Introduced here are approaches to training and then employing computer-implemented models designed to upsample discrete audio signals to higher sampling rates. Assume, for example, that a media production platform obtains a first discrete audio signal at a relatively low sampling rate. The relatively low sampling rate may make the first discrete audio signal unsuitable for inclusion in media compilations, so the media production platform may attempt to improve its quality through upsampling. To accomplish this, the media production platform can apply a transform to the first discrete audio signal to produce a first magnitude spectrogram. Then, the media production platform can apply a computer-implemented model to the first magnitude spectrogram to produce a second magnitude spectrogram. Thereafter, the media production platform can apply an inverse transform to the second magnitude spectrogram to create a second discrete audio signal that has a higher sampling rate than the first discrete audio signal.
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
G06N 3/088 - Non-supervised learning, e.g. competitive learning
G10L 19/02 - Speech or audio signal analysis or synthesis techniques for redundancy reduction, e.g. in vocoders; coding or decoding of speech or audio signals using source-filter models or psychoacoustic analysis, using spectral analysis, e.g. transform vocoders or subband vocoders
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups, characterised by the analysis technique, using neural networks
18.
Tokenization of text data to facilitate automated discovery of speech disfluencies
Introduced here are computer programs and associated computer-implemented techniques for discovering the presence of filler words through tokenization of a transcript derived from audio content. When audio content is obtained by a media production platform, the audio content can be converted into text content as part of a speech-to-text operation. The text content can then be tokenized and labeled using a Natural Language Processing (NLP) library. Tokenizing/labeling may be performed in accordance with a series of rules associated with filler words. At a high level, these rules may examine the text content (and associated tokens/labels) to determine whether patterns, relationships, verbatim matches, and context indicate that a term is a filler word. Any filler words that are discovered in the text content can be identified as such so that appropriate action(s) can be taken.
Introduced here are computer programs and associated computer-implemented techniques for facilitating the creation of a master transcription (or simply “transcript”) that more accurately reflects underlying audio by comparing multiple independently generated transcripts. The master transcript may be used to record and/or produce various forms of media content, as further discussed below. Thus, the technology described herein may be used to facilitate editing of text content, audio content, or video content. These computer programs may be supported by a media production platform that is able to generate the interfaces through which individuals (also referred to as “users”) can create, edit, or view media content. For example, a computer program may be embodied as a word processor that allows individuals to edit voice-based audio content by editing a master transcript, and vice versa.
The invention relates to simultaneous recording and uploading systems and methods, and, more particularly, to the simultaneous recording and uploading of multiple files from the same conversation.
The invention relates to audio drift normalization, and more particularly to audio drift normalization systems and methods that can normalize audio drift of a plurality of recordings from a source.
Different types of media experiences can be developed based on characteristics of the consumer. “Linear” experiences may require execution of a pre-built script, although the script could be dynamically modified by a media production platform. Linear experiences can include guided audio tours that are modified or updated based on the location of the consumer. “Enhanced” experiences include conventional media content that is supplemented with intelligent media content. For example, turn-by-turn directions could be supplemented with audio descriptions about the surrounding area. “Freeform” experiences, meanwhile, are those that can continually morph based on information gleaned from a consumer. For example, a radio station may modify what content is being presented based on the geographical metadata uploaded by a computing device associated with the consumer.
G06F 17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range of values
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance, using icons
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G10L 15/26 - Speech-to-text systems
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
25.
Platform for producing and delivering media content
Media content can be created and/or modified using a network-accessible platform. Scripts for content-based experiences could be readily created using one or more interfaces generated by the network-accessible platform. For example, a script for a content-based experience could be created using an interface that permits triggers to be inserted directly into the script. Interface(s) may also allow different media formats to be easily aligned for post-processing. For example, a transcript and an audio file may be dynamically aligned so that the network-accessible platform can globally reflect changes made to either item. User feedback may also be presented directly on the interface(s) so that modifications can be made based on actual user experiences.
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity coding or matrixing
G06F 16/61 - Indexing; data structures therefor; storage structures
Different types of media experiences can be developed based on characteristics of the consumer. “Linear” experiences may require execution of a pre-built script, although the script could be dynamically modified by a media production platform. Linear experiences can include guided audio tours that are modified or updated based on the location of the consumer. “Enhanced” experiences include conventional media content that is supplemented with intelligent media content. For example, turn-by-turn directions could be supplemented with audio descriptions about the surrounding area. “Freeform” experiences, meanwhile, are those that can continually morph based on information gleaned from a consumer. For example, a radio station may modify what content is being presented based on the geographical metadata uploaded by a computing device associated with the consumer.
G06F 17/20 - Handling natural language data
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range of values
G10L 15/26 - Speech-to-text systems
G10L 15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
27.
Platform for producing and delivering media content
Media content can be created and/or modified using a network-accessible platform. Scripts for content-based experiences could be readily created using one or more interfaces generated by the network-accessible platform. For example, a script for a content-based experience could be created using an interface that permits triggers to be inserted directly into the script. Interface(s) may also allow different media formats to be easily aligned for post-processing. For example, a transcript and an audio file may be dynamically aligned so that the network-accessible platform can globally reflect changes made to either item. User feedback may also be presented directly on the interface(s) so that modifications can be made based on actual user experiences.
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G10L 21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
G10L 19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity coding or matrixing
G06F 16/61 - Indexing; data structures therefor; storage structures