Certain aspects of the present disclosure provide techniques and apparatuses for observing an environment and predicting a future state of the environment using environment-agnostic causal representation machine learning models. An example method generally includes receiving, at a computing system, an image of an environment and an indication of an action to be performed in the environment. Using an encoder neural network trained to generate embedding representations of images from a plurality of environments into a common embedding space, an embedding representation of the image is generated. Using a predictive neural network, causal variables for a future state of the environment are predicted based on the action and the embedding representation of the image. Based on the predicted causal variables, the future state of the environment after execution of the action is predicted.
A device includes a memory configured to store data corresponding to a trained transformer for next image patch prediction. The device also includes one or more processors coupled to the memory. The one or more processors are configured to obtain an input image and to generate input data that corresponds to a sequence of patches of the input image, the patches arranged to preserve an aspect ratio of the input image. The one or more processors are also configured to process the input data at the trained transformer to generate a transformer output that corresponds to one or more predicted next patches of the sequence.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
A processor-implemented method for implementing graph cuts for explainability using an artificial neural network (ANN) includes receiving, via the ANN, an input. The input is represented as a graph. The graph includes nodes connected by edges. The ANN determines a graph cut between a source node and a sink node associated with the input by solving a quadratic process with equality constraints. The ANN processes a subset of the input based on the graph cut to generate a prediction.
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/771 - Feature selection, e.g. selecting representative features from a multi-dimensional feature space
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
4.
MEMORY DEVICE AND METHOD FOR PROTECTING A MEMORY DEVICE FROM THE EFFECT OF ROW HAMMERING
A memory device comprises a DRAM unit formed by a plurality of memory point matrices subdivided into a plurality of memory blocks, and activation counters respectively associated with the memory blocks. The protection method comprises, with each activation of a row of a given block, incrementing the activation counter associated with the given block, and initiating a preventive refresh of all rows of the given block when the activation counter of this given block exceeds a threshold value. The method also comprises at least one of: including, in the preventive refresh, at least some of the rows of at least one block directly adjacent to the given block; incrementing, during activation of the row(s) of the given block, the activation counter of at least one directly adjacent block; and initiating the preventive refresh of all the rows of the adjacent block when the activation counter of the adjacent block exceeds a threshold value.
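The counter scheme in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `BlockCounterGuard`, the `neighbor_increment` weight, and the choice to reset a counter after a preventive refresh are hypothetical and are not taken from the abstract, which describes hardware counters.

```python
from dataclasses import dataclass, field


@dataclass
class BlockCounterGuard:
    """Toy model of per-block activation counters (all names are hypothetical)."""
    num_blocks: int
    threshold: int
    neighbor_increment: int = 1  # assumed weight for counting in adjacent blocks
    counters: list = field(default_factory=list)
    refreshed: list = field(default_factory=list)  # log of preventively refreshed blocks

    def __post_init__(self):
        self.counters = [0] * self.num_blocks

    def _neighbors(self, block):
        # Directly adjacent blocks that actually exist.
        return [b for b in (block - 1, block + 1) if 0 <= b < self.num_blocks]

    def _refresh(self, block):
        # Stand-in for refreshing every row of the block.
        # Assumption: the counter restarts from zero after the refresh.
        self.refreshed.append(block)
        self.counters[block] = 0

    def activate(self, block):
        """Record one row activation inside `block`."""
        self.counters[block] += 1
        for nb in self._neighbors(block):
            self.counters[nb] += self.neighbor_increment
        # Refresh the activated block and any adjacent block over the threshold.
        for b in [block, *self._neighbors(block)]:
            if self.counters[b] > self.threshold:
                self._refresh(b)
```

For example, with four blocks and a threshold of 3, four activations in block 1 push the counters of blocks 0, 1 and 2 over the threshold, so all three are refreshed while block 3 is left alone.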
G11C 11/4078 - Safety or protection circuits, e.g. to prevent inadvertent or unauthorised reading or writing; Status cells; Test cells
G11C 11/406 - Management or control of refresh or charge-regeneration cycles
Systems and techniques are provided for processing image data. For example, a process can include processing a source image to generate first features for the source image and a target image to generate second features for the target image. The process can include generating a first cluster map for the source image based on prototypes and the first features for the source image, and generating a second cluster map for the target image based on the prototypes and the second features for the target image. The process can include determining a propagated cluster map for the source image based on the first cluster map and a correspondence between regions of the source image and regions of the target image. The process can include determining a loss based on a comparison of the propagated cluster map for the source image and the second cluster map for the target image.
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
6.
MEMORY DEVICE PROVIDED WITH DRAM MEMORY CIRCUITS ARRANGED IN SUCH A WAY AS TO MINIMIZE THE SIZE OF A MEMORY BLOCK ALLOWING MANAGEMENT OF THE ROW-HAMMERING EFFECT
The invention relates to a memory device comprising: DRAM memory circuits (100), the total capacity of which is divided into a first part (102) and a second part (103) larger than the first part (102); and a control circuit configured to access the memory circuits, the control circuit comprising: a first block (201) configured to execute a first algorithm (201A) intended to protect the first part (102) from a row-hammering effect; and a second block (202) configured to execute a second algorithm (202A) intended to protect the second part (103) from a row-hammering effect that may occur, the second algorithm (202A) using a main table stored in the first part (102).
Techniques are described herein for processing video data. For example, a computing device can project, via a projection network, the plurality of patches to generate features representing the plurality of patches. The computing device can generate a first similarity score based on a projection of the features representing the plurality of patches onto features of a prototype. The computing device can generate a second similarity score based on a projection of predicted masked target projection tokens output from a decoder onto the features of the prototype. The computing device can predict, via a clustering engine, a first cluster assignment based on the second similarity score. The computing device can predict, via the clustering engine, a second cluster assignment based on the first similarity score. The computing device can determine, via the clustering engine, a loss based on the first cluster assignment and the second cluster assignment.
Systems, methods, and computer-readable media are described. An example system for processing data includes one or more memories that store radar data from a radar system. The radar data includes frequency domain data. The one or more memories also store image data from a plurality of camera sensors. The system includes one or more processors configured to encode the image data to generate encoded image data. The one or more processors are configured to encode the frequency domain data using an encoder to generate encoded radar data. The one or more processors are configured to fuse the encoded radar data and the encoded image data to generate fused data. The one or more processors are configured to navigate a vehicle based on the fused data.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
9.
EFFICIENT ATTENTION USING SOFT MASKING AND SOFT CHANNEL PRUNING
A processor-implemented method includes configuring a transformer model having multiple attention heads. Each attention head has a set of architecture parameters and weight parameters. The set of architecture parameters is determined for each attention head using a soft pruning technique according to a fixed training budget. In turn, the transformer model generates an inference based on the set of architecture parameters and an input.
Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. In an example method, a current program state comprising a set of program instructions is accessed. A next program instruction is generated using a search operation, comprising generating a probability of the next program instruction based on processing the current program state and the next program instruction using a machine learning model, and generating a value of the next program instruction based on processing the current program state, the next program instruction, and a set of alternative outcomes using the machine learning model. An updated program state is generated based on adding the next program instruction to the set of program instructions.
A device may include one or more memories storing a frontal view image. The device may include one or more processors coupled to the one or more memories and configured to: obtain, based on the frontal view image and via an implicit field engine in a geometry pathway, a depth map; obtain, based on the frontal view image, a masked image; generate, based on the masked image and via a semantic pathway, a reconstructed image; train, based on the depth map and the reconstructed image, a model; and finetune the model using a portion of task-specific labels to obtain a finetuned model that performs semantic view mapping on input images.
G06T 3/4007 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
G06T 7/50 - Depth or shape recovery
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
14.
USING NEURAL RADIANCE FIELDS FOR LABEL EFFICIENT IMAGE PROCESSING
A device may include one or more memories storing a frontal view image. The device may include one or more processors coupled to the one or more memories and configured to: obtain, based on the frontal view image and via an implicit field engine in a geometry pathway, a depth map; obtain, based on the frontal view image, a masked image; generate, based on the masked image and via a semantic pathway, a reconstructed image; train, based on the depth map and the reconstructed image, a model; and finetune the model using a portion of task-specific labels to obtain a finetuned model that performs semantic view mapping on input images.
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle, obtained from sensors mounted on the vehicle
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
15.
LEARNABLE DEFORMATION FOR POINT CLOUD SELF-SUPERVISED LEARNING
A processor-implemented method includes obtaining, with a backbone artificial neural network, an original feature map of point cloud data. The method also includes deforming the point cloud data, with a deformation artificial neural network, into a number of deformed point cloud objects based on the original feature map of point cloud data. The method further includes combining the deformed point cloud objects into a mixed point cloud. The method still further includes extracting, with the backbone artificial neural network, a mixed feature map from the mixed point cloud. The method includes extracting a number of deformed feature maps from the deformed point cloud objects. The method still further includes computing, with a contrastive module, a loss for the backbone artificial neural network and for the deformation artificial neural network based on the mixed feature map and the deformed feature maps.
Systems and techniques are described herein for adapting a pretrained machine learning model. For instance, a process can include encoding a training image into a first feature vector, the training image including a first object located at a first location; generating a second feature vector based on a set of sinusoidal functions using a set of weights; combining the first feature vector with the second feature vector to generate a combined feature vector; processing the combined feature vector using a visual language model to obtain a second location for the first object; and adjusting the set of weights based on a comparison between the first location and the second location.
G06V 10/86 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using syntactic or structural representations of the image or video pattern, e.g. symbolic string recognition; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using graph matching
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06T 11/60 - Editing figures and text; Combining figures or text
G06V 10/40 - Extraction of image or video features
G06V 10/774 - Generating sets of training patterns; Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06F 40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
17.
LEARNABLE DEFORMATION FOR POINT CLOUD SELF-SUPERVISED LEARNING
A processor-implemented method includes obtaining, with a backbone artificial neural network, an original feature map of point cloud data. The method also includes deforming the point cloud data, with a deformation artificial neural network, into a number of deformed point cloud objects based on the original feature map of point cloud data. The method further includes combining the deformed point cloud objects into a mixed point cloud. The method still further includes extracting, with the backbone artificial neural network, a mixed feature map from the mixed point cloud. The method includes extracting a number of deformed feature maps from the deformed point cloud objects. The method still further includes computing, with a contrastive module, a loss for the backbone artificial neural network and for the deformation artificial neural network based on the mixed feature map and the deformed feature maps.
G06T 19/20 - Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
18.
OBJECT DETECTION USING VISUAL LANGUAGE MODELS VIA LATENT FEATURE ADAPTATION WITH SYNTHETIC DATA
Systems and techniques are described herein for adapting a pretrained machine learning model. For instance, a process can include encoding a training image into a first feature vector, the training image including a first object located at a first location; generating a second feature vector based on a set of sinusoidal functions using a set of weights; combining the first feature vector with the second feature vector to generate a combined feature vector; processing the combined feature vector using a visual language model to obtain a second location for the first object; and adjusting the set of weights based on a comparison between the first location and the second location.
G06V 10/774 - Generating sets of training patterns; Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
19.
ADAPTIVE SAMPLING FOR EQUIVARIANT MACHINE LEARNING MODELS
Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A feature tensor generated based on a model input to a machine learning model is accessed. A sampling matrix is generated based on the model input. An activation output is generated using an activation layer of the machine learning model based on the feature tensor and the sampling matrix, and the activation output is provided as output from the activation layer of the machine learning model.
The invention relates to a memory device that comprises: a memory bank provided with n memory rows, each row i being liable to exert a row hammer effect having a range p; a block for preventing the hammer effect, which comprises counting means implementing m hammer counters, each counter k being associated with one or more of the rows i, and which is configured to increment a count k by an increment value kN, the increment value kN being a decreasing function of the duration TPN and also a function of the duration TAN, the increment value kN quantifying the effect of the hammer from the one or more rows i on rows j within hammering range; and a row refresh block configured to refresh one or more rows as soon as a count k reaches a threshold value M.
The present invention relates to the field of machine learning operations. More particularly, it is directed to apparatus, method, and systems for portable machine learning operations adapted to operate on dissimilar computing systems and provide substantially similar results. A method for exact computation of an optimized set of Lookup Table operations within machine learning is provided. The exact outputs can be used to then further create operations that produce exactly correct results based on the implementation.
Certain aspects of the present disclosure provide techniques and apparatus for efficiently adapting a machine learning model from a base task to a downstream task based on frozen matrices. An example method generally includes receiving an input for processing through a layer of a neural network. An output of the layer of the neural network is generated based on a first product, the first product being based on a first trainable scaling vector, a first frozen matrix, a second trainable scaling vector, a second frozen matrix, and the received input.
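As a rough illustration of the product described here, the sketch below computes diag(scale_b) · B · diag(scale_d) · A · x in plain Python. The ordering of the factors, the function names, and the parameter names are assumptions made for illustration; the abstract only lists which quantities the first product is based on. In this reading only the two scaling vectors would be trained, while the matrices A and B stay frozen.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def scale(scaling, vector):
    """Element-wise product, i.e. multiplication by diag(scaling)."""
    return [s * v for s, v in zip(scaling, vector)]


def adapter_output(x, frozen_a, scale_d, frozen_b, scale_b):
    """Compute diag(scale_b) @ B @ diag(scale_d) @ A @ x (assumed ordering)."""
    hidden = scale(scale_d, matvec(frozen_a, x))     # trainable scale after frozen A
    return scale(scale_b, matvec(frozen_b, hidden))  # trainable scale after frozen B


# Frozen matrices: A maps 2 -> 3 dimensions, B maps 3 -> 2 dimensions.
A = [[1, 0], [0, 1], [1, 1]]
B = [[1, 1, 1], [0, 1, 0]]
print(adapter_output([2, 3], A, [1, 0, 2], B, [0.5, 4]))
```

Because the scaling vectors have one entry per row of each matrix, the number of trainable values grows with the layer dimensions rather than with their product.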
Systems and techniques are described for providing iterative policy-guided program synthesis. For example, a device may generate, based on a policy that receives input-output data of one or more tasks as input, a first set of programs; add the first set of programs and the input-output data to a training dataset to generate an updated training dataset; train the policy based on the first set of programs and the input-output data to generate an updated policy; identify, based on the updated policy, a second set of programs for second input-output data for a second set of tasks; add the second set of programs and the second input-output data to the updated training dataset to generate a second updated training dataset; and train the updated policy based on the second set of programs and the second input-output data to generate a second updated policy.
Certain aspects of the present disclosure provide techniques and apparatus for efficiently adapting a machine learning model from a base task to a downstream task based on frozen matrices. An example method generally includes receiving an input for processing through a layer of a neural network. An output of the layer of the neural network is generated based on a first product, the first product being based on a first trainable scaling vector, a first frozen matrix, a second trainable scaling vector, a second frozen matrix, and the received input.
A bare die package with a guard to reduce or prevent material seepage into an air cavity, and related fabrication methods, are disclosed. In exemplary aspects, to prevent or reduce material (e.g., an encapsulation material such as a mold material and/or a coating material) entering or seeping into the air cavity in the active filter region of a filter, the die package includes a guard structure. The guard structure is a structure on or adjacent to the die operable to be used in a filter that redirects material away from, or reduces material entering, the gap between the die and the substrate. The guard structure reduces or prevents the material entering the air cavity of the die so as to avoid such material affecting the acoustic performance of the air cavity of the filter.
H01L 23/00 - Details of semiconductor or other solid-state devices
H01L 23/31 - Encapsulations, e.g. encapsulating layers, coatings, characterised by their arrangement
H01L 23/498 - Electrical connections on insulating substrates
H01L 25/00 - Assemblies consisting of a plurality of semiconductor or other solid-state devices
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
27.
DIE PACKAGE WITH GUARD STRUCTURE TO REDUCE OR PREVENT MATERIAL SEEPAGE INTO AIR CAVITY, AND RELATED FABRICATION METHODS
A bare die package with a guard to reduce or prevent material seepage into an air cavity, and related fabrication methods, are disclosed. In exemplary aspects, to prevent or reduce material (e.g., an encapsulation material such as a mold material and/or a coating material) entering or seeping into the air cavity in the active filter region of a filter, the die package includes a guard structure. The guard structure is a structure on or adjacent to the die operable to be used in a filter that redirects material away from, or reduces material entering, the gap between the die and the substrate. The guard structure reduces or prevents the material entering the air cavity of the die so as to avoid such material affecting the acoustic performance of the air cavity of the filter.
Certain aspects of the present disclosure provide techniques and apparatus for improved machine learning. A set of training data is accessed, and a transformation group comprising a plurality of group elements is determined. A set of unconstrained weights for a layer of the machine learning model is generated based on the set of training data. A set of parameter values for a likelihood function for the layer is generated based on the set of training data. A set of constrained weights is generated, based at least in part on the likelihood function and the set of unconstrained weights, such that the set of constrained weights is equivariant with respect to at least a subset of the plurality of group elements.
Certain aspects of the present disclosure provide techniques and apparatus for improved program synthesis using machine learning. An input indicating a programming task is accessed. A generated program is generated based on processing the input using a trained machine learning model. In response to determining that the generated program failed to satisfy the programming task, feedback is generated, and a revised program is generated based on processing the feedback using the trained machine learning model. In response to determining that the revised program satisfied the programming task, one or more parameters of the trained machine learning model are updated based on the revised program.
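The generate, check, and revise loop described in this abstract might be sketched as follows. Every name here is hypothetical: `generate`, `check`, `make_feedback`, and `update` are caller-supplied stand-ins for the trained model, the task verifier, the feedback generator, and the parameter update, and the `max_rounds` limit is an added assumption rather than something the abstract states.

```python
def synthesize_with_feedback(task, generate, check, make_feedback, update, max_rounds=3):
    """Generate a program, revise it from feedback, and learn from successful revisions.

    All four callables are hypothetical placeholders supplied by the caller.
    """
    program = generate(task, None)           # first attempt, no feedback yet
    if check(task, program):
        return program                        # solved directly; no revision needed
    for _ in range(max_rounds):
        feedback = make_feedback(task, program)
        program = generate(task, feedback)    # revised program conditioned on feedback
        if check(task, program):
            update(task, program)             # only successful revisions drive an update
            return program
    return None                               # no satisfying program within the budget
```

In this sketch the update step runs only when a revised program passes the check, mirroring the abstract's condition that parameters are updated in response to the revised program satisfying the task.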
The present disclosure relates to methods and apparatus for graphics processing. The apparatus may identify at least one mesh associated with at least one frame. The apparatus may also divide the at least one mesh into a plurality of groups of primitives, each of the plurality of groups of primitives including at least one primitive and a plurality of vertices. The apparatus may also compress the plurality of groups of primitives into a plurality of groups of compressed primitives, the plurality of groups of compressed primitives being associated with random access. Additionally, the apparatus may decompress the plurality of groups of compressed primitives, at least one first group of the plurality of groups of compressed primitives being decompressed in parallel with at least one second group of the plurality of groups of compressed primitives.
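One way to picture the group-wise compression is the CPU-side sketch below. It is an illustration under several assumptions: `zlib` stands in for whatever codec the apparatus actually uses, a thread pool stands in for parallel hardware decoding, and `GROUP_SIZE` and all function names are hypothetical. Each group is compressed independently, which is what lets any single group be decoded without touching the others.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

GROUP_SIZE = 2  # triangles per group; an arbitrary illustrative value


def split_into_groups(triangles, group_size=GROUP_SIZE):
    """Divide a mesh (list of triangles) into fixed-size groups of primitives."""
    return [triangles[i:i + group_size] for i in range(0, len(triangles), group_size)]


def compress_group(group):
    """Compress one group on its own so it can later be decoded independently."""
    flat = [c for tri in group for vertex in tri for c in vertex]
    return zlib.compress(struct.pack(f"<{len(flat)}f", *flat))


def decompress_group(blob):
    """Rebuild the triangles of a single group from its compressed bytes."""
    raw = zlib.decompress(blob)
    flat = struct.unpack(f"<{len(raw) // 4}f", raw)
    verts = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    return [tuple(verts[i:i + 3]) for i in range(0, len(verts), 3)]


def decompress_all(blobs):
    """Decode every group concurrently; results come back in group order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(decompress_group, blobs))
```

Random access then amounts to calling `decompress_group` on one entry of the list of compressed groups. Coordinates are stored as 32-bit floats in this sketch, so values that are not exactly representable in that format would come back rounded.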
Techniques and systems are provided for image processing. For instance, a process can include obtaining, from one or more image sensors, a first image and a second image; determining local motion between the first image and the second image for features of the first image and the second image; generating motion vectors based on the local motion; and identifying an object based on the motion vectors.
Some embodiments include a method of generating an environment reference model for positioning comprising: receiving multiple data sets representing a scanned environment including information about a type of sensor used and data for determining an absolute position of objects or feature points represented by the data sets; extracting one or more objects or feature points from each data set; determining a position of each object or feature point in a reference coordinate system; generating a three-dimensional vector representation of the scanned environment aligned with the reference coordinate system including representation of the objects or feature points at corresponding locations; creating links between the objects or feature points in the three-dimensional vector model with an identified type of sensor by which they can be detected in the environment; and storing the three-dimensional vector model representation and the links in a retrievable manner.
Embodiments include methods, and processing devices for implementing the methods. Various embodiments may include calculating a batch softmax normalization factor using a plurality of logit values from a plurality of logits of a layer of a neural network, normalizing the plurality of logit values using the batch softmax normalization factor, and mapping each of the normalized plurality of logit values to one of a plurality of manifolds in a coordinate space. In some embodiments, each of the plurality of manifolds represents a number of labels to which a logit can be classified. In some embodiments, at least one of the plurality of manifolds represents a number of labels other than one label.
Methods, systems, and apparatuses are provided to determine a first topology of a reference skeleton model and a second topology of a sensed skeleton model, the first topology and the second topology each identifying and characterizing one or more nodes. In some examples, an apparatus may perform operations that adjust a positioning of one or more data points of the second dataset based at least on the one or more nodes of the first topology and the one or more nodes of the second topology.
This disclosure provides systems, methods, and devices for vehicle driving assistance systems that support image processing. In one embodiment, a computing device may receive a plurality of point clouds containing position information for a scene and may determine corresponding trajectories for the plurality of point clouds. The computing device may determine, based on the corresponding trajectories, a baseline trajectory for the scene and may determine a plurality of projected coordinate sets based on the plurality of point clouds and the baseline trajectory. The computing device may determine common objects that are present within at least two of the plurality of projected coordinate sets and may determine a transformation matrix based on the common objects. A combined point cloud for the scene may be determined by applying the transformation matrix to at least a subset of the plurality of point clouds. Other aspects and features are also claimed and described.
According to the invention a method is provided for accessing a resource of a control unit comprising: executing a virtualization system on at least one processor of the control unit, the virtualization system including an interpreter (28) for bytecode and/or a script, the virtualization system assigning processor time and memory space to at least one guest system; executing a first guest system (24) running on the virtualization system; emitting, by the first guest system (24), an access request (38) to access a resource (5, 7, 32) of the control unit to the virtualization system (22); determining, by the virtualization system (22), that the access request is not allowable for the first guest system (24); loading by the interpreter (28) bytecode instructions and/or script instructions based on the request of the first guest system (24); and executing, by the interpreter (28), the loaded bytecode or script instructions to access the resource (5, 7, 32).
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
38.
APPARATUS AND METHODS FOR A ROBUST OUTLIER REJECTION
Methods, systems, and apparatus are provided to determine a threshold for an iterative process configured to generate sets of model parameters for another process that determines pose estimations. For example, an apparatus may determine a first set of model parameters of a process based on a subset of the match data and determine a first threshold based on an uncertainty parameter. Additionally, the apparatus may apply the process to the match data in accordance with the first set of model parameters. Further, the apparatus may determine a number of inlier data elements based on the application of the process to the match data and the determined first threshold. In some examples, each inlier data element may characterize a particular three-dimensional data point and corresponding two-dimensional data point that is within the determined first threshold.
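The inlier-counting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the model parameters are reduced to a hypothetical translation-only camera pose with fixed pinhole intrinsics, and the rule `threshold = scale * sigma_px` is an assumed way of deriving a threshold from an uncertainty parameter, since the abstract does not give one. All function and parameter names are hypothetical.

```python
import math


def project(point3d, translation, focal, cx, cy):
    """Project a 3D point to pixels using a translation-only pose (assumption)."""
    x = point3d[0] + translation[0]
    y = point3d[1] + translation[1]
    z = point3d[2] + translation[2]
    return (focal * x / z + cx, focal * y / z + cy)


def count_inliers(matches, translation, focal, cx, cy, sigma_px, scale=3.0):
    """Count 3D-2D matches whose reprojection error is within the threshold.

    matches: list of (point3d, pixel) pairs.
    sigma_px: uncertainty parameter in pixels (hypothetical).
    """
    # Assumed rule: the threshold grows linearly with the uncertainty.
    threshold = scale * sigma_px
    inliers = 0
    for point3d, pixel in matches:
        u, v = project(point3d, translation, focal, cx, cy)
        error = math.hypot(u - pixel[0], v - pixel[1])
        if error <= threshold:
            inliers += 1
    return inliers, threshold


matches = [
    ((0.0, 0.0, 2.0), (320.0, 240.0)),  # exact match
    ((1.0, 0.0, 2.0), (570.0, 240.0)),  # exact match
    ((0.0, 1.0, 2.0), (320.0, 500.0)),  # 10 px off, rejected at 6 px
]
inliers, threshold = count_inliers(
    matches, (0.0, 0.0, 0.0), focal=500.0, cx=320.0, cy=240.0, sigma_px=2.0
)
```

In an iterative scheme of this kind, the count would typically be compared across candidate parameter sets, with the set producing the most inliers retained.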
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
Methods, systems, and apparatuses are provided to classify features of a geographical area based on map information. For example, a computing device receives map data characterizing roadways of a geographical area. Further, the computing device determines, based on the map data, a plurality of nodes and a plurality of segments connecting the plurality of nodes. The computing device filters the plurality of nodes based on a first feature to determine a portion of the plurality of nodes, and filters the plurality of segments based on a second feature to determine a portion of the plurality of segments. Further, the computing device clusters the portion of the plurality of segments and the portion of the plurality of nodes, and generates final clusters based on the clustered portions of the plurality of segments and the plurality of nodes. The computing device then generates classification data classifying the final clusters.
Systems and techniques for environment mapping are described. In some examples, a system receives image data and depth data captured using at least one sensor. The image data and the depth data both include respective representations of an environment. The system processes the image data using semantic segmentation to identify segments of the environment that represent different types of objects in the environment in the image data. The system combines the depth data with the semantic segmentation to generate a voxel-based three-dimensional map of the environment.
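One way to picture the combination of depth data and semantic segmentation described above is a labelled voxel grid. The sketch below assumes the depth data has already been back-projected into 3D points and that each point carries the class label of its pixel; the majority-vote rule and the names `build_voxel_map` and `voxel_size` are illustrative assumptions, not details from the abstract.

```python
import math
from collections import Counter, defaultdict


def build_voxel_map(points, labels, voxel_size):
    """Return {voxel_index: label} using a majority vote per voxel.

    points: list of (x, y, z) positions derived from depth data (assumed input).
    labels: per-point semantic labels from the segmentation (assumed input).
    """
    votes = defaultdict(Counter)
    for (x, y, z), label in zip(points, labels):
        # Integer voxel index along each axis.
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        votes[key][label] += 1
    # Keep the most frequent label observed in each occupied voxel.
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


points = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (0.4, 0.4, 0.4), (1.2, 0.1, 0.1)]
labels = ["floor", "chair", "chair", "wall"]
voxel_map = build_voxel_map(points, labels, voxel_size=0.5)
```

Only occupied voxels are stored, which keeps the map sparse.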
Certain aspects of the present disclosure provide techniques for pose estimation for three-dimensional object reconstruction. In one example, a method includes receiving image data, wherein the image data comprises a plurality of images taken from varying poses; identifying one or more pairs of spatially related images within the plurality of images; generating a synchronization graph indicative of at least one similarity metric between the plurality of images, based at least in part on the identified one or more pairs of spatially related images; and estimating a pose of an object depicted in the plurality of images based on the synchronization graph.
Methods, systems, and apparatuses are provided to track features across multiple images for use in various systems. For example, a computing device receives at least a first image and a second image captured by a camera, and detects a feature within each of the first image and the second image. The feature is located at a first feature position within the first image and at a second feature position within the second image. The computing device also receives a first sensor pose of the sensor used to capture the first image and a second sensor pose of the sensor used to capture the second image. The computing device determines a portion of a third image based on the first sensor pose, the second sensor pose, the first feature position, and the second feature position. The computing device then generates feature detection data characterizing whether the feature is detected.
A method for classifying a human-object interaction includes identifying a human-object interaction in an input. Context features of the input are identified. Each identified context feature is compared with the identified human-object interaction. An importance of the identified context feature is determined for the identified human-object interaction. The context feature is fused with the identified human-object interaction when the importance is greater than a threshold.
G06V 10/00 - Arrangements for image or video recognition or understanding
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/778 - Active pattern-learning, e.g. online learning of image or video features
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/00 - Scenes; Scene-specific elements
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
44.
MEMORY DEVICE PROVIDED WITH DRAM MEMORY CIRCUITS ARRANGED IN SUCH A WAY AS TO MINIMIZE THE SIZE OF A MEMORY BLOCK ALLOWING MANAGEMENT OF ROW HAMMERING
The invention relates to a memory device comprising: DRAM memory circuits (100), the total capacity of which is divided into a first part (102) and a second part (103) larger than the first part (102); and a control circuit configured to access the memory circuits, the control circuit comprising: a first block (201) configured to execute a first algorithm (201A) intended to protect the first part (102) from a row-hammering effect; and a second block (202) configured to execute a second algorithm (202A) intended to protect the second part (103) from a row-hammering effect that may occur, the second algorithm (202A) using a main table stored in the first part (102).
A pose tracking method and system, and an apparatus and a storage medium. The method includes: acquiring a current frame image of a mobile device on which a plurality of positioning lamps is disposed; extracting, based on the current frame image, light spot features corresponding to the positioning lamps; acquiring inertial measurement data; determining whether a pose has been initialized; if so, acquiring, as a reference pose, a pose corresponding to a previous frame image, performing tracking and matching based on the light spot features, the reference pose and the inertial measurement data, and obtaining, as a light spot serial number, a serial number of the positioning lamp corresponding to a light spot on the mobile device; otherwise, performing an initialization search based on the light spot features to obtain, as the light spot serial number, the serial number of the positioning lamp corresponding to the light spot on the mobile device; and obtaining a current pose of the mobile device at the current frame image time by fusing the light spot serial number and the inertial measurement data in a tightly coupled manner. According to the embodiments of the present invention, the robustness of pose tracking is improved, and user experience is optimized.
A control method and system, a tracking method and system, a device, and a storage medium are provided. The control method includes: acquiring, using a detector (200), an image of a movable device (100) at a current frame time, the movable device (100) being equipped with a plurality of light output units (110) for outputting signal light (S1); predicting, based on the image, a motion state of the movable device (100) at a next frame time (S2); and obtaining, based on the motion state of the movable device (100) at the next frame time, configuration information of a light output state of each of the light output units (110) at the next frame time, to control the light output unit (110) (S3). The present invention is conducive to adaptively adjusting the light output states of all the light output units (110) based on a real-time state of the movable device (100), which helps the light output units (110) achieve accurate and low-power light output states, ensuring the accuracy of the movable device (100) in use and reducing its power consumption.
Methods, systems, and apparatuses are provided to cluster and match image feature descriptors for use in various systems. For example, a computing device receives a location from a remote device. The computing device applies a first clustering process to a plurality of descriptors associated with the location to determine a number of descriptor clusters. The computing device also applies a second clustering process to the number of descriptor clusters to determine a descriptor cluster center for each of the number of descriptor clusters. Further, the computing device generates descriptor cluster data characterizing a similarity between the plurality of descriptors and the descriptor cluster centers. The computing device then transmits the descriptor cluster data to the remote device. The remote device may match descriptors to the descriptor cluster centers based on the descriptor cluster data.
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
A method for generating a causal graph includes receiving a data set including observation data and intervention data corresponding to multiple variables. A probability distribution is determined for each variable based on the observation data. A likelihood of including each edge in the graph is computed based on the probability distribution and the intervention data. Each edge is a causal connection between variables of the multiple variables. The graph is generated based on the likelihood of including each edge. The graph may be updated by iteratively repeating the determination of the probability distribution and the computing of the likelihood of including each edge.
Certain aspects of the present disclosure provide techniques and apparatuses for inferencing against a multidimensional point cloud using a machine learning model. An example method generally includes generating a score for each respective point in a multidimensional point cloud using a scoring neural network. Points in the multidimensional point cloud are ranked based on the generated score for each respective point in the multidimensional point cloud. The top k points are selected from the ranked multidimensional point cloud, and one or more actions are taken based on the selected top k points.
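The score, rank, and select sequence described above can be sketched in a few lines. The scoring function here is a hypothetical stand-in for the scoring neural network (it simply favours points near the origin), and the names `score_point` and `select_top_k` are assumptions introduced for illustration.

```python
import heapq


def score_point(point):
    """Hypothetical stand-in for the scoring network: closer to origin scores higher."""
    return -sum(c * c for c in point)


def select_top_k(points, k, score_fn=score_point):
    """Rank points by score and return the k highest-scoring points, best first."""
    # Pair each score with the point index so the original points can be recovered.
    scored = [(score_fn(p), i) for i, p in enumerate(points)]
    top = heapq.nlargest(k, scored)
    return [points[i] for _, i in top]


cloud = [(3.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 0.5)]
top_two = select_top_k(cloud, k=2)
```

Using `heapq.nlargest` avoids sorting the full cloud when k is much smaller than the number of points.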
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/42 - Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/774 - Generating sets of training patterns; Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
50.
PROCESSING IMAGES USING TEMPORALLY-PROPAGATED CLUSTER MAPS
Systems and techniques are provided for processing image data. For example, a process can include processing a source image to generate first features for the source image and a target image to generate second features for the target image. The process can include generating a first cluster map for the source image based on prototypes and the first features for the source image, and generating a second cluster map for the target image based on the prototypes and the second features for the target image. The process can include determining a propagated cluster map for the source image based on the first cluster map and a correspondence between regions of the source image and regions of the target image. The process can include determining a loss based on a comparison of the propagated cluster map for the source image and the second cluster map for the target image.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/40 - Scenes; Scene-specific elements in video content
51.
SELF-SUPERVISED POINT CLOUD ORDERING USING MACHINE LEARNING MODELS
An occupancy map processing method includes: obtaining an occupancy map of a region comprising a plurality of cells, corresponding to sub-regions, each including an occupancy indication indicative of occupier type of the sub-region, and the plurality of cells comprising delimiter cells and non-delimiter cells; and providing, from the apparatus, occupancy information comprising first occupancy information corresponding to the delimiter cells and either second occupancy information corresponding to fewer than all of the non-delimiter cells or no second occupancy information.
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
A processor-implemented method for implementing graph cuts for explainability using an artificial neural network (ANN) includes receiving, via the ANN, an input. The input is represented as a graph. The graph includes nodes connected by edges. The ANN determines a graph cut between a source node and a sink node associated with the input by solving a quadratic process with equality constraints. The ANN processes a subset of the input based on the graph cut to generate a prediction.
A method for managing model updates by a first zone server, associated with a first zone model of a plurality of zone models, includes receiving a global model from a global server associated with the global model. The method also includes transmitting the global model to user equipment (UEs) in a first group of UEs associated with the first zone model. The method further includes receiving, from one or more UEs in the first group, model updates associated with the global model based on transmitting the global model. The method further includes transmitting, to the global server, an average of the model updates received from the one or more UEs. The method also includes updating the global model to generate the first zone model based on the model updates. The method further includes transmitting the first zone model to one or more UEs in the first group.
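A single round at the zone server, as described above, might look like the following sketch. Models and updates are represented as flat lists of floats, and the zone model is formed by adding the averaged update to the global model; the abstract says only that the zone model is generated "based on the model updates", so that update rule, along with the names `average_updates`, `apply_update`, and `zone_round`, is an assumption.

```python
def average_updates(updates):
    """Element-wise mean of the model updates received from the UEs."""
    count = len(updates)
    return [sum(values) / count for values in zip(*updates)]


def apply_update(model, update, step=1.0):
    """Add a (scaled) update to a model; the additive rule is an assumption."""
    return [weight + step * delta for weight, delta in zip(model, update)]


def zone_round(global_model, ue_updates):
    """Return (average sent to the global server, resulting zone model)."""
    averaged = average_updates(ue_updates)
    zone_model = apply_update(global_model, averaged)
    return averaged, zone_model


global_model = [1.0, 2.0]
ue_updates = [[0.5, -1.0], [1.5, 1.0]]  # updates from two UEs in the zone
averaged, zone_model = zone_round(global_model, ue_updates)
```

The averaged update goes upstream to the global server, while the zone model stays specific to the UEs in the first group.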
H04L 41/16 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
H04W 8/18 - Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
H04W 8/20 - Transfer of user or subscriber data
A processor-implemented method includes receiving, by a user equipment (UE), a zone determination function based on registering for a federated learning process for training a first federated learning model. The method also includes determining, by the UE, a zone membership in accordance with UE parameters and the zone determination function. The method further includes selecting the first federated learning model, by the UE, based on the zone membership. The method includes training the first federated learning model by the UE.
A user equipment (UE) receives a SIB1 message from a base station. The SIB1 message lists first and second public land mobile network identifiers (PLMN IDs), the first PLMN ID having a corresponding tracking area code (TAC) and the second PLMN ID not having a TAC. The UE reports the first PLMN ID and TAC but not the second PLMN ID while performing PLMN selection. In another aspect, the UE unsuccessfully attempts to select or reselect to a shared cell of the base station with the second PLMN ID. In response to the failed attempt, the UE bars the shared cell as a candidate for cell selection/reselection, due to a missing TAC. When the UE attempts to select or reselect to the shared cell with the first PLMN ID, which may or may not have a TAC, the UE reevaluates the barring due to selection of the first PLMN ID.
A microphone including a casing having a front wall, a back wall, and a side wall joining the front wall to the back wall, a transducer mounted to the front wall, the transducer including a substrate and a transducing element, the transducing element having a transducer acoustic compliance dependent on the transducing element dimensions, a back cavity cooperatively defined between the back wall, the side wall, and the transducer, the back cavity having a back cavity acoustic compliance. The transducing element is dimensioned such that the transducing element length matches a predetermined resonant frequency and the transducing element width, thickness, and elasticity produce a transducer acoustic compliance within a given range of the back cavity acoustic compliance.
B81B 7/02 - Microstructural systems containing distinct electrical or optical devices of particular relevance for their function, e.g. microelectro-mechanical systems [MEMS]
H10N 30/30 - Piezoelectric or electrostrictive devices with mechanical input and electrical output, e.g. functioning as generators or sensors
A computer-implemented method for contrastive object representation from temporal data using an artificial neural network (ANN) includes receiving, by the ANN, a video. The video comprises a temporal sequence of frames including images of one or more objects. The ANN generates object representations corresponding to the one or more objects based on temporal data of multiple frames of the temporal sequence of frames. The object representations are communicated to a receiver.
H04N 19/20 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
G06V 10/776 - Validation; Performance evaluation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/40 - Scenes; Scene-specific elements in video content
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/436 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
Systems, devices, methods, and implementations related to contact detection are described herein. In one aspect, a system is provided. The system includes a first piezoelectric microelectromechanical systems (MEMS) transducer coupled to an object and configured to generate a first analog signal transduced from vibrations propagating through the object. The system includes a second piezoelectric MEMS transducer configured to generate a second analog signal transduced from acoustic vibrations at a location of the object, and classification circuitry coupled to the output of the first piezoelectric MEMS transducer and the output of the second piezoelectric MEMS transducer, where the classification circuitry is configured to process data from the first analog signal and data from the second analog signal, and to categorize combinations of the first analog signal and the second analog signal received during one or more time frames.
G01P 15/09 - Measuring acceleration; Measuring deceleration; Measuring shock, i.e. sudden change of acceleration by making use of inertia forces with conversion into electric or magnetic values by piezoelectric pick-up
H10N 39/00 - Integrated devices, or assemblies of multiple devices, comprising at least one piezoelectric, electrostrictive or magnetostrictive element covered by groups
H10N 30/30 - Piezoelectric or electrostrictive devices with mechanical input and electrical output, e.g. functioning as generators or sensors
Aspects include piezoelectric acoustic transducers and systems for acoustic transduction. In some aspects, an acoustic transducer is structured with a silicon substrate having a top surface and a bottom surface, where the top surface has a first portion and an edge along the first portion associated with an acoustic aperture. The transducer has a first silicon oxide layer disposed over the first portion of the top surface of the silicon substrate, a polysilicon layer disposed over the first silicon oxide layer, and a second silicon oxide layer disposed over the polysilicon layer. A cantilevered beam comprising a fixed end, a deflection end, a top surface, and a bottom surface, has a first portion of the bottom surface at the fixed end disposed over the second silicon oxide layer, where a second portion of the bottom surface at the deflection end is formed over the acoustic aperture. In some aspects, transducer elements are reconfigurable between parallel and serial configurations depending on a system operating mode.
G01H 11/08 - Measuring mechanical vibrations or ultrasonic, sonic or infrasonic waves by detecting changes in electric or magnetic properties by electric means using piezoelectric devices
B06B 1/06 - Methods or apparatus for generating mechanical vibrations of infrasonic, sonic or ultrasonic frequency making use of electrical energy operating with piezoelectric effect or with electrostriction
62.
SEMICONDUCTOR DEVICE COMPRISING A STACK OF CHIPS, AND CHIPS FOR SUCH A STACK
The invention relates to a semiconductor device (1) comprising a stack of chips (C1; C) arranged in successive levels along a stacking direction, each chip extending in a main plane perpendicular to the stacking direction. The stack (E) comprises a plurality of chips (C1) of a first type comprising a first portion (P1) and a second portion (P2) each extending in the main plane, the first portion (P1) being liable to release more heat than the second portion (P2) when the chip is operating. Each chip of the first type (C1) is arranged in mechanical contact with a chip in an adjacent level of the stack (E) by way of a stacking surface that extends only over its second portion (P2), such that its first portion (P1) forms a projecting part able to be exposed to a cooling fluid.
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 23/522 - Arrangements for conducting electric current within the device in operation from one component to another, including external interconnections consisting of a multilayer structure of conductive and insulating layers inseparably formed on the semiconductor body
H01L 23/528 - Layout of the interconnection structure
H01L 23/498 - Leads on insulating substrates
H01L 23/34 - Arrangements for cooling, heating, ventilating or temperature compensation
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, the devices being of types provided for in two or more different main groups of the same subclass , , , , or
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
63.
CAUSAL REPRESENTATION LEARNING FOR INSTANTANEOUS TEMPORAL EFFECTS
A processor-implemented method for causal representation learning of temporal effects includes receiving, via an artificial neural network (ANN), temporal sequence data for high-dimensional observations. The ANN generates a latent representation based on latent variables for the temporal sequence data. The latent variables of the temporal sequence data are assigned to causal variables. The ANN determines a representation of causal factors for each dimension of the temporal sequence data based on the assignment.
Certain aspects of the present disclosure provide techniques for pose estimation for three-dimensional object reconstruction. In one example, a method includes receiving image data, wherein the image data comprises a plurality of images taken from varying poses; identifying one or more pairs of spatially related images within the plurality of images; generating a synchronization graph indicative of at least one similarity metric between the plurality of images, based at least in part on the identified one or more pairs of spatially related images; and estimating a pose of an object depicted in the plurality of images based on the synchronization graph.
Aspects of transducers with feedback transduction are described. One aspect is a transducer system comprising an operational amplifier having an inverting input, a non-inverting input, and an output. The transducer system also includes a piezoelectric microelectromechanical system (MEMS) transducer having a first node and a second node, wherein the first node is coupled to the inverting input of the operational amplifier, and wherein the piezoelectric MEMS transducer is configured to generate an electrical signal across the first node and the second node in response to a signal incident upon the piezoelectric MEMS transducer. The transducer system also includes an attenuator having an input and an output, wherein the input of the attenuator is coupled to the output of the operational amplifier, and wherein the output of the attenuator is coupled to the second node of the piezoelectric MEMS transducer.
Aspects of acoustic transducers are described. One aspect is a microelectromechanical (MEMS) transducer comprising a substrate and multiple cantilevered beams. A first cantilevered beam comprises a first protrusion and a first piezoelectric structure, where the first piezoelectric structure comprises a first deflection end and a first fixed end, where the first fixed end is coupled to the substrate, and where the first deflection end is cantilevered away from the substrate. The first cantilevered beam is separated from a second cantilevered beam by a gap. The first protrusion is disposed at the first deflection end and increases a thickness of the first cantilevered beam along the gap at the first deflection end. A second protrusion of the second beam is disposed at a second deflection end and increases a thickness of the second cantilevered beam along the gap at the second deflection end.
A method for mobility and zone management in zone-based federated learning includes receiving, at a zone management device of multiple zone management devices, a global model from a first network device associated with the global model. Each of the multiple zone management devices is associated with a corresponding zone model of multiple zone models. The zone management device transmits the global model to mobile devices in a first zone associated with the first zone model based on a zone membership. The zone management device receives weights associated with the global model from each mobile device in the first zone. The zone management device updates the first zone model based on the received weights and the zone membership. The zone management device transmits the updated first zone model to each mobile device in the first zone.
A method for mobility and zone management in zone-based federated learning includes receiving, at a zone management device of multiple zone management devices, a global model from a first network device associated with the global model. Each of the multiple zone management devices is associated with a corresponding zone model of multiple zone models. The zone management device transmits the global model to mobile devices in a first zone associated with the first zone model based on a zone membership. The zone management device receives weights associated with the global model from each mobile device in the first zone. The zone management device updates the first zone model based on the received weights and the zone membership. The zone management device transmits the updated first zone model to each mobile device in the first zone.
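The zone-model update described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the received weights are combined, so plain averaging is assumed here, and the names `average_weights`, `update_zone_model`, `zone_members`, and `received` are hypothetical.

```python
from typing import Dict, List, Set

Weights = List[float]


def average_weights(client_weights: List[Weights]) -> Weights:
    """Element-wise mean of the weight vectors (assumed aggregation rule)."""
    if not client_weights:
        raise ValueError("no client weights received")
    n = len(client_weights)
    return [sum(column) / n for column in zip(*client_weights)]


def update_zone_model(zone_members: Set[str], received: Dict[str, Weights]) -> Weights:
    """Update a zone model using only weights from devices in the zone membership."""
    member_weights = [w for device, w in received.items() if device in zone_members]
    return average_weights(member_weights)


# Hypothetical example: two devices belong to the zone, a third does not.
zone_members = {"ue1", "ue2"}
received = {"ue1": [1.0, 2.0], "ue2": [3.0, 4.0], "ue9": [100.0, 100.0]}
zone_model = update_zone_model(zone_members, received)
```

Filtering by zone membership before averaging keeps weights from devices outside the zone from influencing the zone model.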
A method of collaboratively training a neural network model includes receiving a local update from a subset of the multiple users. The local update is related to one or more subsets of a dataset of the neural network model. A local component of the neural network model identifies a subset of the one or more subsets to which a data point belongs. A global update is computed for the neural network model based on the local updates from the subset of the users. The global updates for each portion of the network are aggregated to train the neural network model.
Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may divide at least one scene into a plurality of meshlets, each of the meshlets including a plurality of primitives, and each of the primitives including a plurality of vertices. The apparatus may also calculate a pair of texture coordinates for each of the plurality of vertices. Further, the apparatus may select a size of each of the plurality of meshlets in the at least one scene based on the pair of the texture coordinates and based on a perspective projection of each of the plurality of meshlets. The apparatus may also calculate layout information in a meshlet atlas for each of the meshlets in the at least one scene. Moreover, the apparatus may shade each of a plurality of pixels in the meshlet atlas based on the calculated layout information.
Aspects are provided for multiband multiplexers. One example is a multiband multiplexer with a first filter element configured to have a first passband that spans a first predefined frequency range of a first communication band and a second predefined frequency range of a second communication band, wherein the first predefined frequency range overlaps a portion of the second predefined frequency range, a second filter element configured to have a second passband distinct from the first passband, a third filter element configured to have a third passband distinct from the first and second passbands, and a fourth filter element configured to have a fourth passband distinct from the first, second, and third passbands.
H04B 1/00 - Details of transmission systems, not covered by a single one of the groups; Details of transmission systems not characterised by the medium used for transmission
Aspects described herein provide a method of processing data, including: receiving a set of global parameters for a plurality of machine learning models; processing data stored locally on a processing device with the plurality of machine learning models according to the set of global parameters to generate a machine learning model output; receiving, at the processing device, user feedback regarding machine learning model output for the plurality of machine learning models; performing an optimization of the plurality of machine learning models based on the machine learning output and the user feedback to generate locally updated machine learning model parameters; sending the locally updated machine learning model parameters to a remote processing device; and receiving a set of globally updated machine learning model parameters for the plurality of machine learning models.
A method for managing model updates by a first network device includes receiving, at the first network device associated with a first zone model of multiple zone models, a global model from a second network device associated with the global model. The method also includes transmitting, from the first network device, the global model to user equipment (UEs) in a first group of UEs associated with the first zone model, a different group of UEs associated with each of the plurality of zone models. The method further includes receiving, at the first network device, weights associated with the global model from each UE in the first group. The method still further includes updating, at the first network device, the first zone model based on the received weights. The method also includes transmitting, from the first network device, the updated first zone model to each UE in the first group.
H04L 41/16 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks, using machine learning or artificial intelligence
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
H04W 8/18 - Processing of user or subscriber data, e.g. subscribed services, user preferences or user profiles; Transfer of user or subscriber data
H04W 8/20 - Transfer of user or subscriber data
A method for human-object interaction detection includes receiving an image. A set of features is extracted from multiple positions of the image. One or more human-object pairs may be predicted based on the extracted set of features. A human-object interaction may be determined based on a set of candidate interactions and the predicted human-object pairs.
G06V 20/00 - Scenes; Scene-specific elements
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
G06F 18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
A method for managing model updates by a first network device includes receiving, at the first network device associated with a first zone model of multiple zone models, a global model from a second network device associated with the global model. The method also includes transmitting, from the first network device, the global model to user equipment (UEs) in a first group of UEs associated with the first zone model, a different group of UEs associated with each of the plurality of zone models. The method further includes receiving, at the first network device, weights associated with the global model from each UE in the first group. The method still further includes updating, at the first network device, the first zone model based on the received weights. The method also includes transmitting, from the first network device, the updated first zone model to each UE in the first group.
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
A method for human-object interaction detection includes receiving an image. A set of features is extracted from multiple positions of the image. One or more human-object pairs may be predicted based on the extracted set of features. A human-object interaction may be determined based on a set of candidate interactions and the predicted human-object pairs.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
Disclosed are apparatuses and methods for fabricating the apparatuses. In one aspect, an apparatus includes a high-power die mounted on a backside of a package substrate. A heat transfer layer is disposed on the backside of the high-power die. A plurality of heat sink interconnects is coupled to the heat transfer layer, where each of the plurality of heat sink interconnects is directly coupled to the heat transfer layer in a vertical orientation.
Disclosed are apparatuses and methods for fabricating the apparatuses. In one aspect, an apparatus includes a high-power die mounted on a backside of a package substrate. A heat transfer layer is disposed on the backside of the high-power die. A plurality of heat sink interconnects is coupled to the heat transfer layer. The plurality of heat sink interconnects is located adjacent the high-power die in a horizontal direction.
H01L 23/367 - Cooling facilitated by the shape of the device
H01L 21/56 - Encapsulations, e.g. encapsulating layers, coatings
H01L 23/31 - Encapsulations, e.g. encapsulating layers, coatings, characterised by their arrangement
H01L 23/373 - Cooling facilitated by the use of particular materials for the device
H01L 23/522 - Arrangements for conducting electric current within the device in operation from one component to another, including external interconnections consisting of a multilayer structure of conductive and insulating layers inseparably formed on the semiconductor body
H03F 3/213 - Power amplifiers, e.g. Class B amplifiers, Class C amplifiers, with semiconductor devices only, in integrated circuits
H01L 21/60 - Attaching leads or other conductive members, to be used for carrying current to or from the device in operation
H01L 23/00 - Details of semiconductor or other solid-state devices
Disclosed are apparatuses and methods for fabricating the apparatuses. In one aspect, an apparatus includes a high-power die mounted on a backside of a package substrate. A heat transfer layer is disposed on the backside of the high-power die. A plurality of heat sink interconnects is coupled to the heat transfer layer. The plurality of heat sink interconnects is located adjacent the high-power die in a horizontal direction.
A method for generating an equivariant neural network includes receiving a set of irreducible representations for an origin-preserving group. A network that is equivariant to the origin-preserving group is dynamically generated based on the set of irreducible representations.
A method for generating an equivariant neural network includes receiving a set of irreducible representations for an origin-preserving group. A network that is equivariant to the origin-preserving group is dynamically generated based on the set of irreducible representations.
Systems and techniques are provided for determining one or more poses of one or more objects. For example, a process can include determining, using a machine learning system, a plurality of keypoints from an image. The plurality of keypoints are associated with at least one object in the image. The process can include determining a plurality of features from the machine learning system based on the plurality of keypoints. The process can include classifying the plurality of features into a plurality of joint types. The process can include determining pose parameters for the at least one object based on the plurality of joint types.
G06V 40/20 - Movements or behaviour, e.g. gesture recognition
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
83.
PACKAGE COMPRISING METAL LAYER CONFIGURED FOR ELECTROMAGNETIC INTERFERENCE SHIELD AND HEAT DISSIPATION
A package that includes a substrate (202), an integrated device (204, 206, 208) coupled to the substrate, an encapsulation layer (209) located over the substrate, at least one encapsulation layer interconnect (211, 212) located in the encapsulation layer (209), and a metal layer (210) located over the encapsulation layer. The substrate (202) includes at least one dielectric layer (220) and a plurality of interconnects (221). The encapsulation layer interconnect (211, 212) is coupled to the substrate (202). The metal layer (210) is configured as an electromagnetic interference (EMI) shield for the package. The metal layer is located over a backside of the integrated device (204, 206, 208).
H01L 23/552 - Protection against radiation, e.g. light
H01L 23/433 - Auxiliary members characterised by their shape, e.g. pistons
H01L 21/60 - Attaching leads or other conductive members, to be used for carrying current to or from the device in operation
84.
Package comprising metal layer configured for electromagnetic interference shield and heat dissipation
A package that includes a substrate, an integrated device coupled to the substrate, an encapsulation layer located over the substrate, at least one encapsulation layer interconnect located in the encapsulation layer, and a metal layer located over the encapsulation layer. The substrate includes at least one dielectric layer and a plurality of interconnects. The encapsulation layer interconnect is coupled to the substrate. The metal layer is configured as an electromagnetic interference (EMI) shield for the package. The metal layer is located over a backside of the integrated device.
H01L 23/552 - Protection against radiation, e.g. light
H01L 21/48 - Manufacture or treatment of parts, e.g. containers, prior to assembly of the devices, using processes not provided for in a single one of the groups or
H01L 21/56 - Encapsulations, e.g. encapsulating layers, coatings
H01L 21/768 - Applying interconnections to be used for carrying current between separate components within the device
H01L 23/367 - Cooling facilitated by the shape of the device
H01L 23/373 - Cooling facilitated by the use of particular materials for the device
H01L 23/48 - Arrangements for conducting electric current to or from the solid-state body in operation, e.g. leads or terminals
H01L 23/49 - Arrangements for conducting electric current to or from the solid-state body in operation, e.g. leads or terminals, formed of soldered structures of the connecting-wire type
A computer-implemented method for tracking with visual object constraints includes receiving a lingual constraint and a video. A word embedding is generated based on the lingual constraint. A set of features is extracted for one or more frames of the video. The word embedding is cross-correlated to the set of features for the one or more frames of the video. A prediction indicating whether the lingual constraint is in the one or more frames of the video is generated based on the cross-correlation.
G06N 3/04 - Architecture, e.g. interconnection topology
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
Certain aspects of the present disclosure provide techniques for training a first model based on a first labeled video dataset; generating a plurality of action-words based on output generated by the first model processing motion data in videos of an unlabeled video dataset; defining labels for the videos in the unlabeled video dataset based on the generated action-words; and training a second model based on the labels for the videos in the unlabeled video dataset.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/46 - Extraction of elements or features of the image
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
A computer-implemented method for tracking with visual object constraints includes receiving a lingual constraint and a video. A word embedding is generated based on the lingual constraint. A set of features is extracted for one or more frames of the video. The word embedding is cross-correlated to the set of features for the one or more frames of the video. A prediction indicating whether the lingual constraint is in the one or more frames of the video is generated based on the cross-correlation.
G06V 20/40 - Scenes; Scene-specific elements in video content
G06V 10/62 - Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
88.
Video processing using a spectral decomposition layer
A method is presented. The method includes receiving a first sequence of frames depicting a dynamic element. The method also includes decomposing each spatial position from multiple spatial positions in the first sequence of frames to a frequency domain. The method further includes determining a distribution of spectral power density over a range of frequencies of the multiple spatial positions. The method still further includes generating a first set of feature maps based on the determined distribution of spectral power density over the range of frequencies. The method still further includes estimating a first physical property of the dynamic element.
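As a rough illustration of the decomposition and feature-map steps above, the sketch below computes a power spectrum for the intensity time series at each spatial position and stores one feature map per frequency bin. It is a minimal sketch under assumptions: a plain discrete Fourier transform is used, frames are nested lists indexed as `frames[t][y][x]`, the function names are hypothetical, and the final step of estimating a physical property is not covered.

```python
import cmath
from typing import List

Frame = List[List[float]]  # indexed as [y][x]


def power_spectrum(signal: List[float]) -> List[float]:
    """Power per frequency bin (bins 0..n//2) of a real-valued time series."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        powers.append(abs(coeff) ** 2 / n)
    return powers


def spectral_feature_maps(frames: List[Frame]) -> List[Frame]:
    """Return one feature map per frequency bin, indexed as maps[k][y][x]."""
    height, width = len(frames[0]), len(frames[0][0])
    n_bins = len(frames) // 2 + 1
    maps = [[[0.0] * width for _ in range(height)] for _ in range(n_bins)]
    for y in range(height):
        for x in range(width):
            series = [frame[y][x] for frame in frames]  # one spatial position over time
            for k, power in enumerate(power_spectrum(series)):
                maps[k][y][x] = power
    return maps


# Hypothetical 4-frame clip, 1x2 pixels: a static pixel and a flickering pixel.
frames = [[[1.0, 1.0]], [[1.0, -1.0]], [[1.0, 1.0]], [[1.0, -1.0]]]
maps = spectral_feature_maps(frames)
```

In this toy clip the static pixel puts all of its power in bin 0, while the flickering pixel puts all of its power in the highest bin, which is the kind of separation the feature maps are meant to expose.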
The present disclosure relates to methods and apparatus for graphics processing at a server and/or a client device. In some aspects, the apparatus may convert application data for at least one frame, the application data corresponding to one or more image functions or one or more data channels. The apparatus may also encode the application data for the at least one frame, the application data being associated with a data stream, the application data being encoded via a video encoding process. The apparatus may also transmit the encoded application data for the at least one frame. Additionally, the apparatus may receive application data for at least one frame, the application data being associated with a data stream. The apparatus may also decode the application data for the at least one frame; and convert the application data for the at least one frame.
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or spatial interpolation, e.g. alteration of picture size or resolution
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
The present disclosure relates to methods and apparatus for graphics processing. The apparatus may identify at least one mesh associated with at least one frame. The apparatus may also divide the at least one mesh into a plurality of groups of primitives, each of the plurality of groups of primitives including at least one primitive and a plurality of vertices. The apparatus may also compress the plurality of groups of primitives into a plurality of groups of compressed primitives, the plurality of groups of compressed primitives being associated with random access. Additionally, the apparatus may decompress the plurality of groups of compressed primitives, at least one first group of the plurality of groups of compressed primitives being decompressed in parallel with at least one second group of the plurality of groups of compressed primitives.
The present disclosure relates to methods and apparatus for graphics processing. The apparatus may configure a plurality of billboards associated with a viewpoint of a first frame of a plurality of frames, the plurality of billboards being configured in one or more layers at least partially around the viewpoint, the configuration of the plurality of billboards being based on one or more volumetric elements between at least one of the plurality of billboards and the viewpoint. The apparatus may also render an image associated with each of the one or more volumetric elements between at least one billboard of the plurality of billboards and the viewpoint, the rendered image including a set of pixels. The apparatus may also store data in the at least one billboard based on the rendered image associated with each of the one or more volumetric elements, the data corresponding to the set of pixels.
A method of protecting a DRAM memory device from the row hammer effect, the memory device comprising a plurality of banks composed of memory rows, may be implemented by at least one logic prevention device configured to respectively associate contiguous sections of rows of a bank with sub-banks. The prevention logic is also configured to execute a preventive refresh cycle of the sub-banks that is entirely executed before the number of rows activated in a sub-bank exceeds a critical hammer value. A DRAM memory device, a buffer circuit or a controller of such a memory may comprise the logic for preventing the row hammer effect.
G11C 11/406 - Organisation or control of the refreshing or charge-regeneration cycles
G11C 11/4078 - Safety or protection circuits, e.g. for preventing inadvertent or unauthorised reading or writing; Status cells; Test cells
G11C 29/50 - Marginal testing, e.g. speed, voltage or current testing
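The sub-bank bookkeeping in the row hammer abstract above can be sketched with a per-sub-bank activation counter. This is a simplified sketch under assumptions, not the patented logic: here a sub-bank is refreshed in one step just before an activation would push its count past the critical value, and the class name `SubBankGuard` and its parameters are hypothetical.

```python
from typing import List


class SubBankGuard:
    """Counts row activations per sub-bank and refreshes before a limit is exceeded."""

    def __init__(self, rows_per_bank: int, rows_per_sub_bank: int, critical_value: int):
        if rows_per_sub_bank < 1 or critical_value < 1:
            raise ValueError("sub-bank size and critical value must be positive")
        self.rows_per_sub_bank = rows_per_sub_bank
        self.critical_value = critical_value
        n_sub_banks = -(-rows_per_bank // rows_per_sub_bank)  # ceiling division
        self.activations: List[int] = [0] * n_sub_banks
        self.refresh_log: List[int] = []  # sub-banks refreshed, in order

    def sub_bank_of(self, row: int) -> int:
        """Contiguous sections of rows map to the same sub-bank."""
        return row // self.rows_per_sub_bank

    def refresh(self, sub_bank: int) -> None:
        """Stand-in for a preventive refresh of every row in the sub-bank."""
        self.refresh_log.append(sub_bank)
        self.activations[sub_bank] = 0

    def activate(self, row: int) -> None:
        sub_bank = self.sub_bank_of(row)
        if self.activations[sub_bank] + 1 > self.critical_value:
            self.refresh(sub_bank)  # refresh completes before the limit is exceeded
        self.activations[sub_bank] += 1


# Hypothetical bank of 8 rows split into 2 sub-banks, critical value 3.
guard = SubBankGuard(rows_per_bank=8, rows_per_sub_bank=4, critical_value=3)
for row in (0, 1, 0, 1, 5):
    guard.activate(row)
```

Counting per sub-bank rather than per row keeps the number of counters small, at the cost of refreshing a whole section when any of its rows is hammered.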
A method of collaboratively training a neural network model includes receiving a local update from a subset of the multiple users. The local update is related to one or more subsets of a dataset of the neural network model. A local component of the neural network model identifies a subset of the one or more subsets to which a data point belongs. A global update is computed for the neural network model based on the local updates from the subset of the users. The global updates for each portion of the network are aggregated to train the neural network model.
A calculation system comprises a computing device having one or more instruction-controlled processing cores and a memory controller, the memory controller including a cache memory; and a memory circuit coupled to the memory controller via a data bus and an address bus, the memory circuit being adapted to have a first m-bit memory location accessible by a plurality of first addresses provided on the address bus, the computing device being configured to select, for each memory operation accessing the first m-bit memory location, one address among the plurality of first addresses.
G06F 12/0888 - Addressing of a memory level in which access to the desired data or data blocks requires associative addressing means, e.g. caches, using selective caching, e.g. cache purging
95.
MANAGING OCCLUSION IN SIAMESE TRACKING USING STRUCTURED DROPOUTS
A method for object tracking includes receiving a target image of an object of interest. Latent space features of the target image are modified at a forward pass for a neural network by dropping at least one channel of the latent space features, dropping a channel corresponding to a slice of the latent space features, or dropping one or more features of the latent space features. At the forward pass, a location of the object of interest in a search image is predicted based on the modified latent space features. The location of the object of interest is identified by aggregating predicted locations from the forward pass.
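Two of the dropout variants named above, and the final aggregation step, might look like the sketch below. It rests on assumptions: latent features are nested lists indexed as `[channel][row][col]`, dropping means zeroing, predicted locations are averaged, and all function names are hypothetical. The tracker that turns modified features into a predicted location is not shown.

```python
from typing import List, Sequence, Tuple

Features = List[List[List[float]]]  # indexed as [channel][row][col]


def drop_channel(features: Features, channel: int) -> Features:
    """Return a copy of the features with one whole channel zeroed."""
    return [
        [[0.0 if c == channel else value for value in row] for row in plane]
        for c, plane in enumerate(features)
    ]


def drop_features(features: Features,
                  positions: Sequence[Tuple[int, int, int]]) -> Features:
    """Return a copy with individual (channel, row, col) entries zeroed."""
    dropped = set(positions)
    return [
        [[0.0 if (c, r, k) in dropped else value for k, value in enumerate(row)]
         for r, row in enumerate(plane)]
        for c, plane in enumerate(features)
    ]


def aggregate_locations(predictions: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Average the (x, y) locations predicted with different dropouts (assumed rule)."""
    if not predictions:
        raise ValueError("no predictions to aggregate")
    n = len(predictions)
    return (sum(x for x, _ in predictions) / n, sum(y for _, y in predictions) / n)


# Hypothetical 2-channel, 1x2 latent feature block.
target = [[[1.0, 2.0]], [[3.0, 4.0]]]
masked = drop_channel(target, 0)
location = aggregate_locations([(0.0, 2.0), (2.0, 4.0)])
```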
Certain aspects of the present disclosure provide a method for performing machine learning, comprising: determining a plurality of vertices in a neighborhood associated with a mesh including a target vertex; determining a linear transformation configured to parallel transport signals along all edges in the mesh to the target vertex; applying the linear transformation to the plurality of vertices in the neighborhood to form a combined signal at the target vertex; determining a set of basis filters; linearly combining the basis filters using a set of learned parameters to form a gauge equivariant convolution filter, wherein the gauge equivariant convolution filter is constrained to maintain gauge equivariance; applying the gauge equivariant convolution filter to the combined signal to form an intermediate output; and applying a nonlinearity to the intermediate output to form a convolution output.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
Certain aspects of the present disclosure provide a method for performing machine learning, comprising: determining a plurality of vertices in a neighborhood associated with a mesh including a target vertex; determining a linear transformation configured to parallel transport signals along all edges in the mesh to the target vertex; applying the linear transformation to the plurality of vertices in the neighborhood to form a combined signal at the target vertex; determining a set of basis filters; linearly combining the basis filters using a set of learned parameters to form a gauge equivariant convolution filter, wherein the gauge equivariant convolution filter is constrained to maintain gauge equivariance; applying the gauge equivariant convolution filter to the combined signal to form an intermediate output; and applying a nonlinearity to the intermediate output to form a convolution output.
Aspects described herein provide a method of processing data, including: receiving a set of global parameters for a plurality of machine learning models; processing data stored locally on a processing device with the plurality of machine learning models according to the set of global parameters to generate a machine learning model output; receiving, at the processing device, user feedback regarding machine learning model output for the plurality of machine learning models; performing an optimization of the plurality of machine learning models based on the machine learning output and the user feedback to generate locally updated machine learning model parameters; sending the locally updated machine learning model parameters to a remote processing device; and receiving a set of globally updated machine learning model parameters for the plurality of machine learning models.
A method for classifying a human-object interaction includes identifying a human-object interaction in an input. Context features of the input are identified. Each identified context feature is compared with the identified human-object interaction. An importance of the identified context feature is determined for the identified human-object interaction. The context feature is fused with the identified human-object interaction when the importance is greater than a threshold.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of elements or features of the image
G06K 9/62 - Methods or arrangements for recognition using electronic means
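The threshold-gated fusion in the human-object interaction abstract above can be illustrated with a short sketch. The abstract does not specify how importance is measured or how fusion is performed, so this example assumes cosine similarity as the importance score and an importance-weighted sum as the fusion; `cosine` and `fuse_context` are hypothetical names.

```python
import math
from typing import List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuse_context(interaction: Vector, context_features: List[Vector],
                 threshold: float) -> Vector:
    """Fuse only the context features whose importance exceeds the threshold."""
    fused = list(interaction)
    for context in context_features:
        importance = cosine(interaction, context)  # compare context with interaction
        if importance > threshold:
            fused = [f + importance * c for f, c in zip(fused, context)]
    return fused


# Hypothetical features: the first context vector is aligned, the second is not.
interaction = [1.0, 0.0]
contexts = [[2.0, 0.0], [0.0, 5.0]]
fused = fuse_context(interaction, contexts, threshold=0.5)
```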
100.
PERMUTATION INVARIANT CONVOLUTION (PIC) FOR RECOGNIZING LONG-RANGE ACTIVITIES
A method for recognizing long-range activities in videos includes segmenting an input video stream to generate multiple frame sets. For each of the frame sets, a frame with a highest likelihood of including one or more actions of a set of predefined actions is identified regardless of its order in the frame set. A global representation of the input stream is generated based on pooled representations of the identified frames. A long-range activity in the video stream is classified based on the global representation.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of elements or features of the image
G06K 9/62 - Methods or arrangements for recognition using electronic means
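The segment, select, and pool steps of the long-range activity abstract above can be sketched as follows. This is a minimal sketch under assumptions: frames are already feature vectors, the likelihood of containing a predefined action is supplied by a hypothetical `score` function, pooling is a plain mean, and the final classification step is omitted.

```python
from typing import Callable, List

Vector = List[float]


def segment(frames: List[Vector], set_size: int) -> List[List[Vector]]:
    """Split the frame sequence into consecutive frame sets."""
    return [frames[i:i + set_size] for i in range(0, len(frames), set_size)]


def global_representation(frames: List[Vector], set_size: int,
                          score: Callable[[Vector], float]) -> Vector:
    """Pick the highest-scoring frame of each set, then mean-pool the picks."""
    if not frames:
        raise ValueError("empty video stream")
    # max() ignores the position of a frame inside its set, so reordering
    # frames within a set does not change which frame is selected.
    picked = [max(frame_set, key=score) for frame_set in segment(frames, set_size)]
    dim = len(picked[0])
    return [sum(frame[d] for frame in picked) / len(picked) for d in range(dim)]


# Hypothetical frame features; the first component doubles as the action score.
frames = [[0.1, 1.0], [0.9, 3.0], [0.8, 5.0], [0.2, 7.0]]
rep = global_representation(frames, set_size=2, score=lambda f: f[0])

# Swapping frames inside each set yields the same representation.
swapped = [frames[1], frames[0], frames[3], frames[2]]
rep_swapped = global_representation(swapped, set_size=2, score=lambda f: f[0])
```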