Apparatuses, systems, and techniques apply a force-based (e.g., primal) formulation for object simulation. In at least one embodiment, updates to the force-based formulation are determined by solving for constraints that are to be satisfied when simulating rigid bodies (e.g., in contact-rich scenarios).
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
Apparatuses, systems, and techniques for testing processing units of a computing system are disclosed herein. A request to initiate a testing process for each of a set of processing units is received. A first processing unit includes a first virtual processor executing first operations. A second processing unit includes a second virtual processor executing second operations. Execution of the first operations is transferred from the first virtual processor to the second virtual processor. Execution of the testing process is initiated at the first processing unit while the second virtual processor executes the first and second operations. In response to a detection that the execution of the testing process is completed, execution of the first and second operations is transferred to the first virtual processor. Execution of the testing process is initiated at the second processing unit while the first virtual processor executes the first and second operations.
In various examples, a two-dimensional (2D) and three-dimensional (3D) deep neural network (DNN) is implemented to fuse 2D and 3D object detection results for classifying objects. For example, regions of interest (ROIs) and/or bounding shapes corresponding thereto may be determined using one or more region proposal networks (RPNs)—such as an image-based RPN and/or a depth-based RPN. Each ROI may be extended into a frustum in 3D world-space, and a point cloud may be filtered to include only points from within the frustum. The remaining points may be voxelated to generate a volume in 3D world space, and the volume may be applied to a 3D DNN to generate one or more vectors. The one or more vectors, in addition to one or more additional vectors generated using a 2D DNN processing image data, may be applied to a classifier network to generate a classification for an object.
G06T 7/521 - Depth or shape recovery from laser ranging, e.g. using interferometry; Depth or shape recovery from the projection of structured light
G06T 15/00 - 3D [Three Dimensional] image rendering
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
Adaptive clock mechanisms for serial links utilizing a delay-chain-based edge generation circuit to generate a clock that is a faster (higher-frequency) version of an incoming digital clock. The base frequency of the link clock utilized by the line transmitters is determined by the (slower) clock utilized by the digital circuitry supplying data to the line transmitters. An edge generator that may be composed of only non-synchronous circuit elements multiplies the edges of the slower clock to generate the link clock and also a clock forwarded to the receiver at a phase offset from the link clock.
H04L 7/00 - Arrangements for synchronising receiver with transmitter
H04L 7/033 - Speed or phase control by the received code signals, the signals containing no special synchronisation information using the transitions of the received signal to control the phase of the synchronising-signal-generating means, e.g. using a phase-locked loop
5.
IDENTIFYING APPLICATION BUFFERS FOR POST-PROCESSING AND RE-USE IN SECONDARY APPLICATIONS
Apparatuses, systems, and techniques for buffer identification of an application for post-processing. The apparatuses, systems, and techniques include generating a buffer statistic data structure for a buffer of a plurality of buffers associated with a frame of an application; responsive to detecting a draw call to the buffer, updating the buffer statistic data structure with metadata of the draw call; and determining, based on the buffer statistic data structure, a score reflecting a likelihood of the buffer being associated with a specified buffer type.
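As an illustration of the scoring flow just described, a minimal Python sketch follows. The statistics fields, the draw-call metadata, and the scoring heuristic are all hypothetical stand-ins; the abstract does not specify which statistics are collected or how the score is computed.

```python
from collections import defaultdict

class BufferStats:
    """Per-buffer statistics accumulated over one frame (hypothetical fields)."""
    def __init__(self):
        self.draw_calls = 0
        self.total_vertices = 0

def record_draw_call(stats, buffer_id, vertex_count):
    """Update the buffer's statistics in response to a detected draw call."""
    s = stats[buffer_id]
    s.draw_calls += 1
    s.total_vertices += vertex_count

def score_as_scene_color_buffer(s, frame_total_draws):
    """Toy heuristic: buffers receiving most of the frame's draw calls
    are likely the main scene color buffer."""
    if frame_total_draws == 0:
        return 0.0
    return s.draw_calls / frame_total_draws

stats = defaultdict(BufferStats)
record_draw_call(stats, "buf_a", 3000)   # main scene render target
record_draw_call(stats, "buf_a", 1200)
record_draw_call(stats, "buf_b", 6)      # e.g. a UI quad
total = sum(s.draw_calls for s in stats.values())
scores = {b: score_as_scene_color_buffer(s, total) for b, s in stats.items()}
```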
Apparatuses, systems, and techniques to compute cyclic redundancy checks (CRCs) using a graphics processing unit (GPU). For example, in at least one embodiment, an input data sequence is distributed among GPU threads for parallel calculation of an overall CRC value for the input data sequence according to various novel techniques described herein.
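The property that makes this parallelization possible is that CRC32 is affine over GF(2): per-chunk CRCs computed independently (e.g., one chunk per GPU thread) can be folded into the CRC of the whole sequence. A sketch in plain Python, using the standard `zlib.crc32`, is below; feeding zero bytes to capture the linear part is an O(n) simplification, whereas production combiners use O(log n) matrix exponentiation.

```python
import zlib

def crc32_combine(crc_a, crc_b, len_b):
    """Combine CRC32 values of two adjacent chunks A and B.

    Feeding len_b zero bytes through the CRC register starting from
    crc_a captures the length-dependent linear part; crc_b supplies
    the data-dependent part of chunk B.
    """
    zeros = b"\x00" * len_b
    return crc_b ^ zlib.crc32(zeros, crc_a) ^ zlib.crc32(zeros, 0)

data = b"parallel CRC example payload"
mid = len(data) // 2
chunk_a, chunk_b = data[:mid], data[mid:]

# Each chunk's CRC could be computed by a separate GPU thread;
# the partial results are then folded together on the host or device.
combined = crc32_combine(zlib.crc32(chunk_a), zlib.crc32(chunk_b), len(chunk_b))
assert combined == zlib.crc32(data)
```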
Optical center is determined on a column-by-column and row-by-row basis by identifying brightest pixels in respective columns and rows. The brightest pixels in each column are identified and a line is fit to those pixels. Similarly, brightest pixels in each row are identified and a second line is fit to those pixels. The intersection of the two lines is the optical center.
G06T 7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
H04N 25/61 - Noise processing, e.g. detecting, correcting, reducing or removing noise the noise originating only from the lens unit, e.g. flare, shading, vignetting or "cos4"
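The column-by-column and row-by-row procedure above translates directly into a short NumPy sketch. The synthetic vignetting image and the line-intersection algebra are illustrative; the abstract does not prescribe a fitting method, so ordinary least squares via `np.polyfit` is assumed here.

```python
import numpy as np

# Synthetic vignetting-like image: brightness falls off radially from a
# known optical center (cx, cy). Values are illustrative only.
h, w, cx, cy = 120, 160, 83.0, 57.0
ys, xs = np.mgrid[0:h, 0:w]
img = np.exp(-(((xs - cx) / 60.0) ** 2 + ((ys - cy) / 60.0) ** 2))

# Brightest pixel in each column -> fit row as a function of column.
rows_of_max = img.argmax(axis=0)
a1, b1 = np.polyfit(np.arange(w), rows_of_max, 1)   # row = a1*col + b1

# Brightest pixel in each row -> fit column as a function of row.
cols_of_max = img.argmax(axis=1)
a2, b2 = np.polyfit(np.arange(h), cols_of_max, 1)   # col = a2*row + b2

# Intersection of the two fitted lines is the optical center.
col = (a2 * b1 + b2) / (1.0 - a1 * a2)
row = a1 * col + b1
```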
8.
MODELING EQUIVARIANCE IN FEATURES AND PARTITIONS USING NEURAL NETWORKS FOR THREE-DIMENSIONAL OBJECT DETECTION AND RECOGNITION
In various examples, a technique for modeling equivariance in point neural networks includes generating, via execution of one or more layers included in a neural network, a set of features associated with a first partition prediction for a plurality of points included in a scene. The technique also includes applying, to the set of features, one or more transformations included in a frame associated with the plurality of points to generate a set of equivariant features. The technique further includes generating a second partition prediction for the plurality of points based at least on the set of equivariant features, and causing an object recognition result associated with the plurality of points to be generated based at least on the second partition prediction.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
9.
USING SPECIAL TOKENS FOR SECURE PROMPT TEMPLATE INPUT TO LANGUAGE MODELS
Systems and methods provide for a prompt template to include private random strings that are used by a language model to reference specific tokens on which the language model has been trained. A number of random strings may be generated and inserted into a prompt template and assigned special token identifiers (IDs) in a tokenizer. The text data from the prompt template are tokenized to convert the random strings to the assigned special token IDs, which are sent to the language model for inferencing. Based on the provided special token IDs, the language model may reference special tokens learned from training and then generate an inference. The random strings remain hidden during their lifetime and may be updated on-demand to ensure security.
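A toy sketch of this flow is below. The whitespace tokenizer, the vocabulary, and the special token names and IDs are all invented for illustration; a real deployment would hook the marker-to-ID mapping into an actual tokenizer.

```python
import secrets

# Hypothetical special tokens the model learned during training.
SPECIAL_TOKEN_IDS = {"<sys_begin>": 50001, "<sys_end>": 50002}

def make_private_markers():
    """Generate fresh random strings to stand in for the special tokens
    inside the prompt template; regenerate on demand to rotate them."""
    return {name: secrets.token_hex(16) for name in SPECIAL_TOKEN_IDS}

def tokenize(text, markers, word_to_id):
    """Toy whitespace tokenizer: the private random strings map to the
    assigned special token IDs, ordinary words to a plain vocabulary."""
    marker_to_id = {m: SPECIAL_TOKEN_IDS[n] for n, m in markers.items()}
    return [marker_to_id.get(tok, word_to_id.get(tok, 0)) for tok in text.split()]

markers = make_private_markers()
template = f"{markers['<sys_begin>']} you are helpful {markers['<sys_end>']}"
vocab = {"you": 7, "are": 8, "helpful": 9}
ids = tokenize(template, markers, vocab)
# Only token IDs reach the model; the raw random strings stay hidden.
```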
Apparatuses, systems, and techniques are presented to generate ultrasound images. In at least one embodiment, one or more neural networks are used to generate one or more ultrasound images of one or more objects based, at least in part, upon one or more acoustic properties of the one or more objects.
Disclosed are systems and methods relating to extracting 3D features, such as bounding boxes. The systems can apply an epipolar geometric warping to one or more features of a source image that depicts a scene using a first set of camera parameters, based on a condition view image associated with the source image, to determine a second set of camera parameters. The systems can generate, using a neural network, a synthetic image representing the one or more features and corresponding to the second set of camera parameters.
In various examples, a technique for modeling equivariance in point neural networks includes determining a first partition prediction associated with partitioning of a plurality of points included in a scene into a first set of parts. The technique also includes generating, using a neural network, a second partition prediction associated with partitioning of the plurality of points into a second set of parts based at least on one or more aggregations associated with the first set of parts. The technique further includes determining a plurality of piecewise equivariant regions included in the scene based on the second partition prediction and generating an object recognition result associated with the plurality of points based on the plurality of piecewise equivariant regions.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
13.
DETERMINING OPERATIONAL CAPABILITY FOR HUMAN-OPERATED SYSTEMS AND CONTROL APPLICATIONS
Approaches presented herein provide for the automated determination of a level of impairment of a person, as may be relevant to the performance of a task. A light and camera-based system can be used to determine factors such as gaze nystagmus that are indicative of inebriation or impairment. A test system can simulate motion of a light using a determined pattern, and capture image data of at least the eye region of a person attempting to follow the motion. The captured image data can be analyzed using a neural network to infer at least one behavior of the user, and the behavior determination(s) can be used to determine a capacity or level of impairment of a user. An appropriate action can be taken, such as to allow a person with full capacity to operate a vehicle or perform a task, or to block access to such operation or performance if the person is determined to be impaired beyond an allowable amount.
B60W 50/12 - Limiting control by the driver depending on vehicle state, e.g. interlocking means for the control input for preventing unsafe operation
B60W 40/08 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to drivers or passengers
B60W 50/14 - Means for informing the driver, warning the driver or prompting a driver intervention
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
G06V 40/18 - Eye characteristics, e.g. of the iris
G06V 40/60 - Static or dynamic means for assisting the user to position a body part for biometric acquisition
14.
DIFFERENTIAL TRANSIMPEDANCE AMPLIFIER FROM A SINGLE PHOTODIODE
An integrated circuit is disclosed. The integrated circuit comprises an input interface and an optical receiver. The optical receiver includes a photodiode, level shifter, and differential transimpedance amplifier (TIA). The photodiode has a cathode and anode terminal and is configured to receive an optical signal via the input interface. The level shifter includes a parallel RC circuit. The differential TIA has first and second conversion circuits. The first conversion circuit is connected to the cathode terminal and a first output terminal of the optical receiver. The second conversion circuit is connected between the anode terminal and a second output terminal of the optical receiver. The parallel RC circuit is connected between the cathode terminal of the photodiode and the first conversion circuit. The differential TIA is configured to provide a differential voltage signal at the first and second output terminals of the optical receiver based on the optical signal.
In various examples, systems and methods are disclosed relating to improving perceived video quality through temporal redistribution of network packet payloads that may carry error mitigation data. A subset of network packets for an encoded video stream is identified from a sequence of network packets as corresponding to a region of a video frame of the encoded video stream. A transmission order for the sequence of network packets is determined based at least on the subset of network packets and one or more error correction packets corresponding to the sequence of network packets. The sequence of network packets is transmitted to a receiver client device according to the transmission order.
H04N 21/4402 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
16.
ERROR CONCEALMENT IN VIDEO FRAMES FOR VIDEO STREAMING SYSTEMS AND APPLICATIONS
In various examples, systems and methods are disclosed relating to error concealment by replacing a lost video frame region with a chronological predecessor. Network packets including data corresponding to an encoded bitstream of a frame of a video stream can be received. In response to determining that at least one packet of the video stream has been lost, a region of the video frame corresponding to the lost network packet can be replaced with the same region of a previous frame of the video stream.
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
H04L 65/65 - Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
H04L 69/16 - Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
H04N 19/174 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
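The replacement step in this concealment scheme is simple to sketch. The following assumes frames are NumPy arrays and that the lost packet is known to cover a contiguous band of rows (a simplification; real slices follow the codec's coding-unit layout).

```python
import numpy as np

def conceal_lost_region(current, previous, row_start, row_end):
    """Replace a lost band of rows in the current frame with the
    co-located rows of the previous decoded frame (temporal concealment)."""
    repaired = current.copy()
    repaired[row_start:row_end] = previous[row_start:row_end]
    return repaired

# Toy 8x8 RGB frames: previous frame is gray 100, current is gray 200,
# and rows 2-3 of the current frame were carried by the lost packet.
prev_frame = np.full((8, 8, 3), 100, dtype=np.uint8)
curr_frame = np.full((8, 8, 3), 200, dtype=np.uint8)
curr_frame[2:4] = 0
out = conceal_lost_region(curr_frame, prev_frame, 2, 4)
```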
An integrated circuit includes logic circuitry configured to output serialized data from a plurality of precoded data inputs without use of a clock signal. The integrated circuit further includes a precoder circuit responsive to the clock signal to generate the plurality of precoded data inputs based on timing of data symbols positioned within a plurality of data inputs.
A reference generation circuit includes a proportional-to-absolute temperature (PTAT) current branch having a pair of diodes and a complementary-to-absolute temperature (CTAT) current branch having a pair of resistors. A first bank of diode-connected transistors is positioned in the PTAT branch and a second bank of diode-connected transistors is positioned in the CTAT branch. Diode-connected transistors of the first and second banks of transistors are selectable to tune a gate-source voltage of each of the first and second banks of diode-connected transistors.
G05F 3/24 - Regulating voltage or current wherein the variable is DC using uncontrolled devices with non-linear characteristics being semiconductor devices using diode-transistor combinations wherein the transistors are of the field-effect type only
G05F 1/46 - Regulating voltage or current wherein the variable actually regulated by the final control device is DC
19.
SYNCHRONIZED TWO-WAY DIRECT CURRENT (DC) DATA-BUS-INVERSION ENCODING FOR SIMULTANEOUS BI-DIRECTIONAL SIGNALING
A system includes first transceivers coupled to data lanes, which are coupled to second transceivers, and a first encoder coupled to the first transceivers. The first encoder, responsive to detecting a transmission signal to begin a transmission mode, determines that first bits to be transmitted by the first transceivers over the data lanes include over fifty percent of a first binary value. The first encoder generates a first data-bus-inversion (DBI) polarity signal that alternates in polarity and generates first DBI-encoded bits of the first bits based on the first DBI polarity signal. The first encoder causes the transmission signal to be transmitted to a second encoder coupled to the second transceivers, the transmission signal to synchronize DBI encoding between the first and second encoders.
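The core majority-inversion step of DBI encoding can be sketched in a few lines. This models only the inversion decision and its reversal at the receiver; the alternating polarity signal and the two-way synchronization described above are not modeled.

```python
def dbi_encode(bits):
    """DC data-bus inversion: if more than half of the bits are 1,
    invert the word and assert the DBI polarity bit."""
    if sum(bits) * 2 > len(bits):
        return [b ^ 1 for b in bits], 1
    return list(bits), 0

def dbi_decode(bits, dbi):
    """Receiver reverses the inversion using the DBI bit."""
    return [b ^ dbi for b in bits]

word = [1, 1, 1, 0, 1, 1, 0, 1]       # six ones out of eight -> invert
encoded, dbi = dbi_encode(word)
assert dbi_decode(encoded, dbi) == word
```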
A process to ameliorate scoreboard aliasing in multi-threaded data processors whereby, in response to executing at least one long-latency instruction in a first thread, a shared hardware scoreboard is incremented. A shared software register is incremented and the shared software register is spilled to a first per-thread register, and execution is switched to a second thread. After execution switches back to the first thread, execution of the first thread is suspended until the shared hardware scoreboard reaches a value at or below a difference between a value in the shared software register and the value spilled into the first per-thread register.
In various examples, a technique for end-to-end camera view verification for automotive systems and applications includes computing a checksum for at least a portion of a frame to be displayed on a screen, wherein the frame comprises one or more views captured using one or more cameras. The technique also includes receiving a sequence of checksums associated with one or more previous frames displayed on the screen. The technique further includes updating one or more counters based on one or more comparisons of the computed checksum and the sequence of checksums and causing an alert associated with display of the frame on the screen to be generated based on the one or more counters.
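One plausible use of the counters described above is detecting a frozen camera view: if the checksum of the frame about to be displayed keeps matching recently displayed frames, the display pipeline may be stuck. The sketch below invents a sliding window and repeat threshold; both values and the alert policy are illustrative, not taken from the abstract.

```python
from collections import deque

class FrozenFrameDetector:
    """Compare each new frame checksum against a sequence of recent
    checksums; repeated matches increment a counter that triggers an
    alert (window and threshold are illustrative choices)."""
    def __init__(self, window=8, max_repeats=3):
        self.history = deque(maxlen=window)
        self.repeat_count = 0
        self.max_repeats = max_repeats

    def submit(self, checksum):
        if checksum in self.history:
            self.repeat_count += 1
        else:
            self.repeat_count = 0
        self.history.append(checksum)
        return self.repeat_count > self.max_repeats   # True -> raise alert

det = FrozenFrameDetector()
# Frames 1, 2, 3 vary; then the view freezes on checksum 9.
alerts = [det.submit(c) for c in [1, 2, 3, 9, 9, 9, 9, 9]]
```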
One embodiment of a method for generating an articulation model includes receiving a first set of images of an object in a first articulation and a second set of images of the object in a second articulation, performing one or more operations to generate first three-dimensional (3D) geometry based on the first set of images, performing one or more operations to generate second 3D geometry based on the second set of images, and performing one or more operations to generate an articulation model of the object based on the first 3D geometry and the second 3D geometry.
Systems, computer program products, and methods are described for an endpoint device configured for secure data transmission within a network. An example endpoint device may include a network interface configured to receive a communication request from a peer endpoint device, and an access control unit configured to determine whether a peer endpoint device is IDE qualified based on the communication request. If the peer endpoint device is IDE qualified, the access control unit authorizes the communication request, allowing secure communication between the devices. If the peer endpoint device is not IDE qualified, the access control unit transmits the communication request to a root port for further authorization, verifying that only IDE-qualified devices are permitted to communicate directly.
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
Disclosed are apparatuses, systems, and techniques that provide virtual immersion sound experience and spatialization effects with an audio device supporting a low number of sound channels, according to at least one embodiment. The techniques include but are not limited to associating input audio channels of an audio stream with virtual speakers, identifying, using an optical sensor, positioning of a user's head relative to the virtual speakers, determining simulated sound intensities at one or more reference locations associated with the user's head, and generating, based on the simulated sound intensities, output audio signals configured for physical speakers.
A scan island for automated data retrieval from an integrated circuit (IC) includes a data extraction module configured to extract data from multiple scan chains and random-access memory (RAM) modules in response to a trigger event, storing the data in an external non-volatile storage medium. The scan island further comprises a clock and reset module, which includes a free-running independent clock to enable continuous operation of the scan island upon occurrence of the trigger event, and a local reset module that re-initializes the scan island in a known state following the trigger event without external intervention. The scan island operates as an isolated partition within the IC, ensuring secure and autonomous data retrieval and recovery processes.
Vector-scaled hierarchical codebook quantization reduces the precision (bitwidth) of vectors of parameters and may enable energy-efficient acceleration of deep neural networks. A vector (block array) comprises one or more parameters within a single dimension of a multi-dimensional tensor (or kernel). For example, a block array may comprise 4 sub-vectors (blocks), each comprising 8 parameters. The parameters may be represented in integer, floating-point, or any other suitable format. A vector cluster quantization technique is used to quantize blocks of parameters in real-time. Hardware circuitry within a datapath identifies an optimal codebook of a plurality of codebooks for quantizing each block of parameters, and the block is encoded using the identified codebook. During processing, the identified codebook is used to obtain the quantized parameter and perform computations at the reduced precision.
H03M 13/09 - Error detection only, e.g. using cyclic redundancy check [CRC] codes or single parity bit
H03M 13/00 - Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
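The codebook-selection step can be sketched in NumPy: for each block, try every codebook, quantize each parameter to its nearest codeword, and keep the codebook with the lowest reconstruction error. The two 4-entry scalar codebooks and the 8-parameter block below are invented for illustration; the hardware datapath described above would perform this selection in circuitry, not software.

```python
import numpy as np

def quantize_block(block, codebooks):
    """Pick the codebook whose nearest codewords give the lowest squared
    reconstruction error; encode each parameter as a codeword index."""
    best = None
    for cb_id, cb in enumerate(codebooks):
        idx = np.abs(block[:, None] - cb[None, :]).argmin(axis=1)
        err = np.sum((cb[idx] - block) ** 2)
        if best is None or err < best[0]:
            best = (err, cb_id, idx)
    _, cb_id, idx = best
    return cb_id, idx

def dequantize_block(cb_id, idx, codebooks):
    """Recover reduced-precision parameters from the codebook and indices."""
    return codebooks[cb_id][idx]

# Hypothetical codebooks for small- and large-magnitude blocks.
codebooks = [np.array([-0.3, -0.1, 0.1, 0.3]),
             np.array([-3.0, -1.0, 1.0, 3.0])]
block = np.array([0.28, -0.12, 0.05, 0.31, -0.29, 0.11, 0.02, -0.08])
cb_id, idx = quantize_block(block, codebooks)
recon = dequantize_block(cb_id, idx, codebooks)
```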
27.
COMBINING RULE-BASED AND LEARNED SENSOR FUSION FOR AUTONOMOUS SYSTEMS AND APPLICATIONS
In various examples, systems and methods are disclosed that perform sensor fusion using rule-based and learned processing methods to take advantage of the accuracy of learned approaches and the decomposition benefits of rule-based approaches for satisfying higher levels of safety requirements. For example, in-parallel and/or in-serial combinations of early rule-based sensor fusion, late rule-based sensor fusion, early learned sensor fusion, or late learned sensor fusion may be used to solve various safety goals associated with various required safety levels at a high level of accuracy and precision. In embodiments, learned sensor fusion may be used to make more conservative decisions than the rule-based sensor fusion (as determined using, e.g., severity (S), exposure (E), and controllability (C) (SEC) associated with a current safety goal), but the rule-based sensor fusion may be relied upon where the learned sensor fusion decision may be less conservative than the corresponding rule-based sensor fusion.
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G01S 13/86 - Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
28.
MULTI-MODAL SENSOR FUSION FOR CONTENT IDENTIFICATION IN APPLICATIONS OF HUMAN-MACHINE INTERFACES
Interactions with virtual systems may be difficult when users inadvertently fail to provide sufficient information to proceed with their requests. Certain types of inputs, such as auditory inputs, may lack sufficient information to properly provide a response to the user. Additional information, such as image data, may enable user gestures or poses to supplement the auditory inputs to enable response generation without requesting additional information from users.
In various examples, systems and methods are disclosed relating to bandwidth preservation through selective application of error mitigation techniques for video frame regions. A subset of network packets for a video stream is identified as corresponding to an encoded region of a video frame of the video stream. At least one error correction packet is generated for the subset that encodes the region of the video frame. The network packets and the at least one error correction packet can be transmitted to a receiver device.
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed or the storage space available from the internal hard disk
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/146 - Data rate or code amount at the encoder output
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/174 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
H04N 19/65 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using error resilience
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end or controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
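The simplest error correction packet of this kind is a single XOR parity over the packets carrying one region: any one lost packet in the group can be rebuilt from the survivors and the parity. A sketch follows; real schemes (e.g., Reed-Solomon FEC) tolerate more losses, and the packet contents here are placeholders.

```python
def make_parity_packet(packets):
    """Single XOR parity over the packets that carry one encoded region."""
    size = max(len(p) for p in packets)
    parity = bytearray(size)
    for p in packets:
        for i, byte in enumerate(p.ljust(size, b"\x00")):
            parity[i] ^= byte
    return bytes(parity)

def recover(received, parity, lost_len):
    """XOR the surviving packets into the parity to rebuild the lost one."""
    rebuilt = bytearray(parity)
    for p in received:
        for i, byte in enumerate(p.ljust(len(parity), b"\x00")):
            rebuilt[i] ^= byte
    return bytes(rebuilt[:lost_len])

region_packets = [b"slice-0", b"slice-1", b"slice-2"]
parity = make_parity_packet(region_packets)
# Suppose the middle packet is dropped in transit:
restored = recover([region_packets[0], region_packets[2]],
                   parity, len(region_packets[1]))
assert restored == b"slice-1"
```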
Apparatuses, systems, and techniques to train one or more neural networks using stratified-sampled training data parameters. In at least one embodiment, one or more stochastic training data parameters may be sampled, via stratified sampling, from one or more sampling ranges to compute a gradient for updating the one or more neural networks.
In various examples, a technique for end-to-end telltale verification for automotive systems and applications includes receiving, from a buffer, a set of commands associated with a frame to be displayed on a screen. The technique also includes determining, based at least on the set of commands, (i) an expected checksum for a telltale to be included in the frame and (ii) at least a portion of the frame associated with the telltale. The technique further includes computing a checksum for the at least the portion of the frame, and causing an alert associated with the telltale to be generated based at least on a comparison of the computed checksum with the expected checksum.
Apparatuses, systems, and techniques to prevent information from being read from a second cache location while information is being stored in a first cache location. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to prevent information from being read from a second cache location while information is being stored in a first cache location.
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
G06F 12/0804 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
33.
APPLICATION PROGRAMMING INTERFACE TO INVALIDATE INFORMATION
Apparatuses, systems, and techniques to cause information to be invalidated in a second cache location after information is stored in a first cache location. In at least one embodiment, one or more circuits are to perform an application programming interface (API) to cause information to be invalidated in a second cache location after information is stored in a first cache location.
G06F 12/0891 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
34.
ASSOCIATING TRAFFIC OBJECTS WITH SUPPORTING STRUCTURES IN MAPS FOR AUTONOMOUS SYSTEMS AND APPLICATIONS
In various examples, associating traffic objects with traffic poles or other supporting structures in maps for autonomous systems and applications is described herein. Systems and methods are disclosed that associate traffic objects (e.g., traffic signals, traffic signs, etc.) with traffic poles within maps and/or generate structures that represent the associations within the maps. For instance, a map may indicate poses of one or more traffic objects and/or a traffic pole within an environment. As such, the poses may be used to associate the traffic object(s) with the traffic pole, such as by using one or more threshold distances. Next, the poses, the association(s), and/or general information associated with traffic poles may be used to generate a structure that represents the traffic object(s) connected to the traffic pole.
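The threshold-distance association step can be sketched directly: each traffic object is linked to its nearest pole, provided that pole lies within a distance threshold. The 2D poses and the 4-meter threshold below are illustrative; real map poses would be full 3D poses and the threshold a tuned parameter.

```python
import numpy as np

def associate_objects_to_poles(object_xy, pole_xy, max_dist=4.0):
    """Map each traffic object to the nearest pole within a threshold
    distance; objects with no pole in range stay unassociated."""
    assoc = {}
    for i, obj in enumerate(object_xy):
        d = np.linalg.norm(pole_xy - obj, axis=1)
        j = int(d.argmin())
        if d[j] <= max_dist:
            assoc[i] = j          # object i is supported by pole j
    return assoc

# Map poses (x, y) for two signals, one distant sign, and two poles.
objects = np.array([[10.2, 5.1], [10.4, 4.8], [40.0, 7.0]])
poles = np.array([[10.0, 5.0], [55.0, 7.0]])
links = associate_objects_to_poles(objects, poles)
# objects 0 and 1 attach to pole 0; object 2 is beyond the threshold
```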
In various examples, disclosed techniques use a banding detector neural network to identify the locations and sizes of banding artifacts in pixel regions of an input image. The neural network generates a band size map that identifies at least one banding artifact in the input image. The band size map can include a set of predicted band size values, each of which corresponds to a respective pixel of the input image and represents a distance between edges of a banding artifact. The band size map and the input image are provided as input to a stochastic bilateral blur filter, which generates a de-banded image by applying blurring effects to the input image at the band locations indicated by the band size map. An inverse tone mapping operation is then performed to convert the de-banded image to an image that does not have banding artifacts.
Approaches presented herein provide systems and methods for identifying and removing overlay elements from one or more frames in a video sequence. A frame may be evaluated to identify one or more overlay elements and a mask may be generated, or acquired, to identify one or more regions of the frame associated with the one or more overlay elements. The initial frame and mask may be used as an input to one or more neural networks to remove the overlay elements and replace the overlay elements with generated content associated with an underlying scene in the frame. A reconstructed frame may then be generated and inserted into the video sequence, replacing the frame, for view on a display.
G06T 11/60 - Editing figures and text; Combining figures or text
G06T 3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
37.
INFRARED AND OTHER COLORIZATION WITH RGB IMAGE DATA USING GENERATIVE NEURAL NETWORKS
In various examples, infrared image data (e.g., frames of an infrared (IR) video feed) may be colorized by transferring color statistics from an RGB image with an overlapping field of view, by modifying one or more dimensions of an encoded representation of a generated RGB image, and/or otherwise. For example, segmentation may be applied to the IR and RGB image data, and the one or more colors or statistics may be transferred from a segmented region of the RGB image data to a corresponding segmented region of the IR image data. In some embodiments, synthesized RGB image data may be fine-tuned by transferring color or color statistic(s) from corresponding real RGB image data, and/or by modifying one or more dimensions of an encoded representation of the synthesized RGB image data.
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
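The per-region color statistic transfer described in this entry can be sketched as mean/standard-deviation matching: values in an IR region are rescaled to take on the statistics of the corresponding RGB region. A minimal single-channel sketch; function names and the flat value lists are illustrative assumptions.

```python
def transfer_color_stats(ir_values, rgb_values):
    """Match the mean and standard deviation of an IR region's values to
    those of the corresponding RGB region. Real systems would apply this
    per segmented region and per color channel; here a region is a flat
    list of scalar values for illustration."""
    def mean(xs):
        return sum(xs) / len(xs)

    def std(xs, m):
        return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

    m_ir, m_rgb = mean(ir_values), mean(rgb_values)
    s_ir, s_rgb = std(ir_values, m_ir), std(rgb_values, m_rgb)
    scale = s_rgb / s_ir if s_ir > 0 else 1.0
    # Shift and rescale IR values onto the RGB region's statistics.
    return [(v - m_ir) * scale + m_rgb for v in ir_values]
```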
38.
TECHNIQUES FOR IDENTIFICATION OF OUT-OF-DISTRIBUTION INPUT DATA IN NEURAL NETWORKS
Apparatuses, systems, and techniques to identify out-of-distribution input data in one or more neural networks. In at least one embodiment, a technique includes training one or more neural networks to infer a plurality of characteristics about input information based, at least in part, on the one or more neural networks being independently trained to infer each of the plurality of characteristics about the input information.
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
G06F 18/211 - Selection of the most significant subset of features
G06F 18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
Devices, apparatuses, and systems for thermal management in networking and computing systems are provided. An example in-rack thermal management system includes a cooling distribution unit (CDU) that includes a housing that defines a fluid inlet and a fluid outlet, thermal management components supported by the housing, and direct mechanical connections coupled with the fluid inlet and the fluid outlet. The in-rack thermal management system includes a fluid distribution system that includes a primary fluid channel directly coupled with the direct mechanical connection of the fluid inlet, and a secondary fluid channel directly coupled with the direct mechanical connection of the fluid outlet. In operation, the thermal management components dissipate heat of a fluid received by the CDU via the fluid inlet. The direct mechanical connections directly interface with the fluid distribution system to provide fluid communication between the CDU and the fluid distribution system to maximize dimensions of the housing.
Apparatuses, systems, and techniques to use one or more neural networks to cause a prediction of a quality of one or more wireless signals to be transmitted, wherein the prediction is based, at least in part, on one or more reference signals. In at least one embodiment, a measurement report is to be generated by one or more neural networks.
Apparatuses, systems, and techniques to parse textual data using parallel computing devices. In at least one embodiment, text is parsed by a plurality of parallel processing units using a finite state machine and logical stack to convert the text to a tree data structure. Data is extracted from the tree by the plurality of parallel processors and stored.
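The finite-state-machine-plus-stack conversion of text to a tree can be illustrated serially (the abstract's parallel execution is omitted). The bracketed input grammar and nested-list tree are illustrative assumptions.

```python
def parse_to_tree(text):
    """Tiny state-machine parser with an explicit stack that converts
    bracketed text like "a(b,c(d))" into a nested-list tree. A serial
    sketch of the FSM-plus-logical-stack scheme described above."""
    root = []
    stack = [root]  # top of stack is the node currently being filled
    token = ""
    for ch in text:
        if ch == "(":
            node = [token]        # new subtree labeled by the pending token
            token = ""
            stack[-1].append(node)
            stack.append(node)
        elif ch == ",":
            if token:
                stack[-1].append(token)
                token = ""
        elif ch == ")":
            if token:
                stack[-1].append(token)
                token = ""
            stack.pop()           # close the current subtree
        else:
            token += ch
    if token:
        stack[-1].append(token)
    return root
```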
A computer-implemented method of controlling power consumption in a multi-processor computing device comprises: determining whether a first processor is operating in a high-power regime or a low-power regime; selecting a first set of control rules that includes a first subset of control rules that apply when the first processor is operating in the high-power regime and a second subset of control rules that apply when the first processor is operating in the low-power regime; determining one or more power settings for the first processor based on the first set of control rules; and causing the first processor to perform one or more operations based on the one or more power settings.
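The regime-selected control rules above can be sketched as a lookup keyed by the detected regime. The threshold, rule contents, and names are illustrative assumptions, not the disclosure's actual rules.

```python
HIGH_POWER_THRESHOLD_W = 100.0  # illustrative regime boundary

# Illustrative rule subsets: one applies in the high-power regime,
# the other in the low-power regime.
CONTROL_RULES = {
    "high": {"max_clock_mhz": 1200, "fan_duty": 0.9},
    "low": {"max_clock_mhz": 600, "fan_duty": 0.4},
}

def select_power_settings(measured_power_w):
    """Determine the processor's operating regime, then derive power
    settings from the matching rule subset, per the method above."""
    regime = "high" if measured_power_w >= HIGH_POWER_THRESHOLD_W else "low"
    return regime, CONTROL_RULES[regime]
```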
In various examples, a technique for verifying data integrity is disclosed that includes receiving a request to access a data block of a plurality of data blocks stored in a non-secure memory. The technique further includes identifying, in a secure memory, an authentication token associated with the data block. The technique also includes generating an updated authentication token based on the data block. The technique further includes determining whether the updated authentication token corresponds to the identified authentication token stored in the secure memory. The technique still further includes in response to determining that the updated authentication token corresponds to the identified authentication token stored in the secure memory, performing one or more operations using the data block.
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
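The verification flow in this entry (recompute a token over the data block, compare against the token held in secure memory) can be sketched as follows. HMAC-SHA256 is an illustrative choice of token function, not necessarily the one used by the disclosure.

```python
import hmac
import hashlib

def verify_data_block(data_block: bytes, stored_token: bytes, key: bytes) -> bool:
    """Generate an updated authentication token over the data block and
    compare it, in constant time, with the token from secure memory.
    Returns True only if the block is unmodified."""
    updated_token = hmac.new(key, data_block, hashlib.sha256).digest()
    return hmac.compare_digest(updated_token, stored_token)
```

Only when the recomputed token matches the stored one would subsequent operations proceed with the data block, mirroring the final step of the technique.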
In various examples, synthesizing speech in multiple languages in conversational AI systems and applications is described herein. Systems and methods are disclosed that use one or more models to synthesize speech from a first language spoken by a speaker to a second, target language selected by the speaker. In some examples, to perform the translation, the model(s) may disentangle one or more attributes associated with speech from speakers, such as speakers' identities, speakers' accents, and text associated with the speech. Additionally, the model(s) may allow for fine-grained control of additional attributes associated with output speech, such as one or more frequencies, one or more energies, and one or more phoneme durations. Furthermore, the model(s) may be configured to use the accent associated with the target language when generating text, such as when aligning text encodings with one or more phonemes.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 13/10 - Prosody rules derived from text; Stress or intonation
G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
45.
SCHEDULING AND INITIALIZING SOFTWARE EXECUTION SCHEDULES
Embodiments of the present disclosure relate to a system and method used to schedule and initiate execution of one or more tasks. The system may include processing units that may perform operations that may include obtaining execution information corresponding to a first task and a second task. In some embodiments, the first task may include first operations and the second task may include second operations. In some embodiments, the operations may further include determining a time to initialize execution of the first task and the second task based at least on the execution information. In some embodiments, the first task may be executed on a first computing system and the second task may be executed on a second computing system, where the execution of the first operations and the second operations is interdependent.
In various examples, landmark identification and retargeting for AI systems and applications is described herein. Systems and methods are disclosed that use a first three-dimensional (3D) face (e.g., a morphable model mesh) that is already associated with locations of facial landmarks to determine locations of corresponding facial landmarks on a second 3D face (e.g., a target face mesh). To determine the locations, one or more iterations of transformation processes and/or fitting processes may be performed on the first 3D face in order to morph the landmarks of the first 3D face to align with second landmarks on the second 3D face. After performing the iteration(s) of the transformation processes and/or the fitting processes, closest locations (e.g., vertices) on the second 3D face from the landmark locations (e.g., vertices) on the first 3D face are identified and used as the locations of the corresponding facial landmarks on the second 3D face.
In various examples, infrared image data (e.g., frames of an infrared video feed) may be colorized by applying the infrared image data and/or a corresponding edge map to a generator of a generative adversarial network (GAN). The GAN may be trained with or without paired ground truth RGB and infrared (and/or edge map) images. In an example of the latter scenario, a first generator G(IR)→RGB and a second generator G(RGB)→IR may be trained in a first chain, their positions may be swapped in a second chain, and the second chain may be trained. In some embodiments, edges may be emphasized by weighting edge pixels (e.g., determined from a corresponding edge map) higher than non-edge pixels when backpropagating loss. After training, G(IR)→RGB may be used to generate RGB image data from infrared image data (and/or a corresponding edge map).
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
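The edge-weighting idea in this entry (edge pixels count more than non-edge pixels when backpropagating loss) can be sketched as a weighted per-pixel L1 loss. The weight value and flattened-pixel interface are illustrative assumptions.

```python
def edge_weighted_l1(pred, target, edge_mask, edge_weight=5.0):
    """Per-pixel L1 loss that up-weights edge pixels (identified from a
    corresponding edge map) relative to non-edge pixels. Inputs are
    flattened pixel lists; the 5x weight is an illustrative assumption."""
    total, weight_sum = 0.0, 0.0
    for p, t, is_edge in zip(pred, target, edge_mask):
        w = edge_weight if is_edge else 1.0
        total += w * abs(p - t)
        weight_sum += w
    return total / weight_sum
```

An error of the same magnitude contributes more to the loss when it falls on an edge pixel, steering training toward sharper edges in the generated RGB output.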
48.
MINIMIZING SPACE CHARGE FOR OPTICAL-ELECTRICAL DATA TRANSMISSIONS
To improve the conversion of an optical signal to an electrical signal in a photodiode (PD) that is part of an integrated circuit, the PD can be modified to reduce noise and improve the gain bandwidth. In some aspects, the absorption region of the PD can utilize a non-rectangular geometry, for example, a clipped tapered geometry, which can absorb the optical signal more linearly than a rectangular geometry. In some aspects, the input optical signal can be split into two or more split optical signals, where each split optical signal is directed toward a different portion of the absorption region. The incident power of the optical signal transmitted to each respective portion of the absorption region can be reduced by dividing the incident power by the number of split optical signals, thereby improving the gain and bandwidth saturation of each portion of the absorption region.
H01L 31/0352 - SEMICONDUCTOR DEVICES NOT COVERED BY CLASS - Details thereof characterised by their semiconductor bodies characterised by their shape or by the shapes, relative sizes or disposition of the semiconductor regions
G02B 6/12 - Light guides; Structural details of arrangements comprising light guides and other optical elements, e.g. couplings of the optical waveguide type of the integrated circuit kind
G02B 6/122 - Basic optical elements, e.g. light-guiding paths
H01L 31/0232 - Optical elements or arrangements associated with the device
H01L 31/028 - Inorganic materials including, apart from doping material or other impurities, only elements of Group IV of the Periodic System
Systems, computer program products, and methods are described for secure data transmission. An example system includes a first end-point device, an intermediate device, and a second end-point device. The first end-point device determines the format requirements of the communication link between the first end-point device and the intermediate device, and of the communication link between the intermediate device and the second end-point device. Based on the format requirements, the first end-point device configures the data packet for transmission, such that the data packet, when received at the intermediate device, is re-configured and routed to the second end-point device. When the second end-point device receives the data packet, it verifies the data packet to confirm that the packet has maintained its integrity throughout transit.
Apparatuses, systems, and techniques to perform neural networks. In at least one embodiment, one or more neural networks are used to predict one or more characteristics of one or more first circuits based, at least in part, on one or more characteristics of one or more second circuits.
Apparatuses, systems, and techniques to generate a trusted execution environment including multiple accelerators. In at least one embodiment, a parallel processing unit (PPU), such as a graphics processing unit (GPU), operates in a secure execution mode including a protect memory region. Furthermore, in an embodiment, a cryptographic key is utilized to protect data during transmission between the accelerators.
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
G06F 21/10 - Protecting distributed programs or content, e.g. vending or licensing of copyrighted material
G06F 21/54 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by adding security routines or objects to programs
G06F 21/79 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
52.
SENSOR CALIBRATION USING A DYNAMIC PATTERN GENERATOR FOR IN-CABIN MONITORING SYSTEMS AND APPLICATIONS
In various examples, one or more interior or occupant monitoring sensors may be calibrated using one or more display units (e.g., a projector, LED panel, laser robot, heads-up display) that can actively project or display a unique visual pattern and change one or more attributes of the patterns (e.g., shape, color, brightness, size, frame rate, perspective, etc.) without moving the one or more display units. The present techniques may be utilized to iteratively calibrate a sensor using a dynamic visual pattern with one or more visual attributes that vary from iteration to iteration, and/or to interleave different visual patterns in a common region of an overlapping field of view shared by multiple sensors.
Embodiments of the present disclosure relate to a system and method for estimating object velocity using RADAR data and machine learning. In some embodiments, the method may include determining, using a machine learning model, an estimated velocity corresponding to an object based at least on measured RADAR data, where the measured RADAR data may correspond to RADAR detections associated with the object. In some embodiments, the method may further include determining expected RADAR data corresponding to the object based at least on the estimated velocity. Some embodiments may additionally include updating one or more parameters of the machine learning model based on the difference between the measured RADAR data and the expected RADAR data.
G01S 7/41 - Details of systems according to groups , , of systems according to group using analysis of echo signal for target characterisation; Target signature; Target cross-section
G01S 13/58 - Velocity or trajectory determination systems; Sense-of-movement determination systems
G01S 13/931 - Radar or analogous systems, specially adapted for specific applications for anti-collision purposes of land vehicles
54.
PROTECTING CONTROLLER AREA NETWORK (CAN) MESSAGES IN AUTONOMOUS SYSTEMS AND APPLICATIONS
In various examples, a technique for securely transmitting CAN (Controller Area Network) messages is disclosed that includes receiving, using a cryptographic engine, a message from an application to be transmitted over a CAN (Controller Area Network) bus, wherein the cryptographic engine executes a secure firmware and is implemented on an on-die discrete processor. The technique further includes accessing, using the secure firmware, a key from a plurality of keys associated with an authentication process from a secure memory associated with the cryptographic engine. Additionally, the technique includes computing an authentication tag using the key and the message and transmitting the message with the authentication tag over the CAN bus to a destination address.
Various embodiments are directed towards techniques for automatically generating standard cell layouts. In various embodiments, those techniques include processing a netlist graph to generate a plurality of graph embeddings, processing the plurality of graph embeddings via a transformer model to generate a plurality of device component embeddings, generating a page rank value for each device included in the netlist graph based on the plurality of device component embeddings, performing one or more clustering operations on the page rank values to generate a plurality of device clusters, and performing one or more standard cell synthesis operations using labels for the plurality of device clusters to generate at least one standard cell layout for the netlist graph.
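The per-device ranking step above can be illustrated with plain power-iteration PageRank over a device-level graph. This is a generic PageRank sketch, not the disclosure's embedding-based variant; the adjacency encoding is an illustrative assumption.

```python
def pagerank(adjacency, damping=0.85, iters=50):
    """Power-iteration PageRank. `adjacency` maps each node to the list of
    nodes it points to. Every node here has at least one outgoing edge,
    so no dangling-node handling is included in this sketch."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, outs in adjacency.items():
            share = damping * rank[v] / len(outs)
            for u in outs:
                new[u] += share
        rank = new
    return rank
```

Devices that many other devices connect to accumulate higher rank; clustering such rank values is then one way to group devices before cell synthesis, as the abstract describes.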
The disclosure provides a solution for intelligent machines, such as autonomous vehicles (AVs), that incorporate operating laws into motion planning by expressing complex operating laws along with other planning criteria under a single framework referred to herein as universal planning criteria (UPC). The UPC is expressed using a rule hierarchy approach, which allows for expressing individual operating rules, such as traffic rules, as signal temporal logic (STL) rules that are assigned an importance order. The STL rule hierarchy is advantageously equipped with a scalar rank-preserving reward function that can be differentiable and can be explicitly used for motion planning and/or embedding in neural motion planners. In one aspect the disclosure provides a method of operating an AV that includes: (1) scalably expressing traffic laws and additional planning criteria in a UPC framework, and (2) generating, using a neural motion planner and the UPC framework, a planned trajectory for the AV.
Systems and methods are disclosed that train a content frame-motion latent diffusion model (CDM) and use the CDM to generate requested videos. The CMD may be a two-stage framework that first compresses videos to a succinct latent space and then learns the video distribution in this latent space. For instance, the CMD may include an autoencoder and two diffusion models. In a first stage, using the autoencoder, a low-dimensional latent decomposition into a content frame and latent motion representation is learned. In the second stage, without adding any new parameters, the content frame distribution may be fine-tuned by using a pretrained image diffusion model, which allows the CMD to leverage the rich visual knowledge in pretrained image diffusion models. In addition, a new lightweight diffusion model may be used to generate motion latent representations that are conditioned on the given content frame.
Systems and methods are disclosed that relate to synthesizing high-resolution 3D geometry and strictly view-consistent images that maintain image quality without relying on post-processing super resolution. For instance, embodiments of the present disclosure describe techniques, systems, and/or methods to scale neural volume rendering to the much higher resolution of native 2D images, thereby resolving fine-grained 3D geometry with unprecedented detail. Embodiments of the present disclosure employ learning-based samplers for accelerating neural rendering for 3D GAN training using up to five times fewer depth samples, which enables embodiments of the present disclosure to explicitly “render every pixel” of the full-resolution image during training and inference without post-processing super-resolution in 2D. Together with learning high-quality surface geometry, embodiments of the present disclosure synthesize high-resolution 3D geometry and strictly view-consistent images while maintaining image quality on par with baselines relying on post-processing super resolution.
Parametric distributions of data are one type of data model that can be used for various purposes such as for computer vision tasks that may include classification, segmentation, 3D reconstruction, etc. These parametric distributions of data may be computed from a given data set, which may be unstructured and/or which may include low-dimensional data. Current solutions for learning parametric distributions of data involve explicitly learning kernel parameters. However, this explicit learning approach is not only inefficient in that it requires a high computational cost (i.e. a large number of floating point operations), but it also leaves room for improvement in terms of accuracy of the resulting learned model. The present disclosure provides a neural network architecture that implicitly learns a parametric distribution of data, which can reduce the computational cost while improving accuracy when compared with prior solutions that rely on the explicit learning design.
Virtual reality and augmented reality bring increasing demand for 3D content creation. In an effort to automate the generation of 3D content, artificial intelligence-based processes have been developed. However, these processes are limited in terms of the quality of their output because they typically involve a model trained on limited 3D data thereby resulting in a model that does not generalize well to unseen objects, or a model trained on 2D data thereby resulting in a model that suffers from poor geometry due to ignorance of 3D information. The present disclosure jointly uses both 2D and 3D data to train a machine learning model to be able to generate 3D content from a single 2D image.
Disclosed are apparatuses, systems, and techniques for deploying and training machine learning models for fast and efficient equalization of signals transmitted over communication channels. In one embodiment, the techniques include processing, using first model(s), a digital representation of a signal received (RX) via a communication channel to obtain channel loss metrics representative of a difference between the RX signal and a transmitted (TX) signal. The techniques further include obtaining a first set of equalization (EQ) parameter(s), and iteratively obtaining a second set of EQ parameter(s). The techniques further include configuring, using the second set of the EQ parameters, one or more EQ circuits to equalize at least one of the RX signal, the TX signal, or a channel signal.
In various examples, a technique for multiple object tracking is disclosed that includes generating, using one or more processing units, one or more first encoded image features based on a first image. The technique also includes generating a plurality of first object embeddings based on the first encoded image features, wherein at least one first object embedding of the plurality of first object embeddings corresponds to a different object depicted in the first image. The technique further includes determining, using a first track query of one or more track queries and one or more machine learning operations, a first association between a first track and a first object that corresponds to the at least one first object embedding, wherein the first track corresponds to the first track query. The technique further includes computing, based on the first association, an object trajectory associating the first track with the first object.
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
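The track-to-object association step in this entry can be sketched with a greedy highest-similarity matching between track query embeddings and object embeddings. This simplified stand-in replaces the learned association with a dot-product score; all names and the greedy strategy are illustrative assumptions.

```python
def associate_tracks(track_embeddings, object_embeddings):
    """Greedily assign each track query to the unclaimed object embedding
    with the highest dot-product similarity. A simplified stand-in for the
    learned association described above."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    assignments = {}
    used = set()
    for t_id, t_emb in track_embeddings.items():
        best, best_score = None, float("-inf")
        for o_id, o_emb in object_embeddings.items():
            if o_id in used:
                continue
            score = dot(t_emb, o_emb)
            if score > best_score:
                best, best_score = o_id, score
        if best is not None:
            assignments[t_id] = best
            used.add(best)
    return assignments
```

Chaining such associations frame to frame yields the object trajectories that the abstract's final step computes.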
63.
TRAINING, TESTING, AND VERIFYING AUTONOMOUS MACHINES USING SIMULATED ENVIRONMENTS
In various examples, physical sensor data may be generated by a vehicle in a real-world environment. The physical sensor data may be used to train deep neural networks (DNNs). The DNNs may then be tested in a simulated environment—in some examples using hardware configured for installation in a vehicle to execute an autonomous driving software stack—to control a virtual vehicle in the simulated environment or to otherwise test, verify, or validate the outputs of the DNNs. Prior to use by the DNNs, virtual sensor data generated by virtual sensors within the simulated environment may be encoded to a format consistent with the format of the physical sensor data generated by the vehicle.
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
G06F 18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
64.
METANETWORKS FOR PROCESSING NEURAL NETWORKS AS GRAPHS
Embodiments are disclosed for generating graph representations of neural networks to be used as input for one or more metanetworks. Architectural information can be extracted from a neural network and used to generate a graph representation. A subgraph can be generated for each layer of the neural network, where each subgraph includes nodes that correspond to neurons and connecting edges that correspond to weights. Each layer of the neural network can be associated with a bias node that is connected to individual nodes of that layer using edges representing bias weights. Various types of neural networks and layers of neural networks can be represented by such graphs, which are then used as inputs for metanetworks. The subgraphs can be combined into a comprehensive graph representation of the neural network, which can be provided as input to a metanetwork to generate network parameters or perform another such operation.
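The per-layer subgraph construction described above, including the bias node, can be sketched for a fully connected layer. The node-naming scheme and edge-tuple encoding are illustrative assumptions.

```python
def layer_to_subgraph(layer_index, in_dim, out_dim, weights, biases):
    """Build the subgraph for one fully connected layer: one node per
    neuron, one edge per weight, plus a bias node wired to each output
    neuron with an edge carrying the bias weight. Edges are encoded as
    (source_node, target_node, weight) tuples."""
    edges = []
    for i in range(in_dim):
        for j in range(out_dim):
            edges.append((f"L{layer_index}:n{i}",
                          f"L{layer_index + 1}:n{j}",
                          weights[i][j]))
    bias_node = f"L{layer_index}:bias"
    for j in range(out_dim):
        edges.append((bias_node, f"L{layer_index + 1}:n{j}", biases[j]))
    return edges
```

Concatenating the edge lists of consecutive layers yields the comprehensive graph that the abstract feeds to a metanetwork.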
Systems and methods of the present disclosure include interactive editing for generated three-dimensional (3D) models, such as those represented by neural radiance fields (NeRFs). A 3D model may be presented to a user in which the user may identify one or more localized regions for editing and/or modification. The localized regions may be selected and a corresponding 3D volume for that region may be provided to one or more generative networks, along with a prompt, to generate new content for the localized regions. Each of the original NeRF and the newly generated NeRF for the new content may then be combined into a single NeRF for a combined 3D representation with the original content and the localized modifications.
In various examples, systems and methods are disclosed relating to generating tokens for traffic modeling. One or more circuits can identify trajectories in a dataset, and generate actions from the identified trajectories. The one or more circuits can generate, based at least on the plurality of actions and at least one trajectory of the plurality of trajectories, a set of tokens representing actions to generate trajectories of one or more agents in a simulation. The one or more circuits may update a transformer model to generate simulated actions for simulated agents based at least on tokens generated from the trajectories in the dataset.
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
67.
DYNAMIC PATH SELECTION FOR PROCESSING THROUGH A MULTI-LAYER NEURAL NETWORK
Performance of a neural network is usually a function of the capacity, or complexity, of the neural network, including the depth of the neural network (i.e. the number of layers in the neural network) and/or the width of the neural network (i.e. the number of hidden channels). However, improving performance of a neural network by simply increasing its capacity has drawbacks, the most notable being the increased computational cost of a higher-capacity neural network. Since modern neural networks are configured such that the same neural network is evaluated regardless of the input, a higher capacity neural network means a higher computational cost incurred per input processed. The present disclosure provides for a multi-layer neural network that allows for dynamic path selection through the neural network when processing an input, which in turn can allow for increased neural network capacity without incurring the typical increased computation cost associated therewith.
Transformers are neural networks that learn context, and thus meaning, by tracking relationships in sequential data. The main building block of transformers is self-attention, which allows all input sequence tokens to interact with one another. This scheme effectively captures short- and long-range spatial dependencies, which enables the use of transformers with Natural Language Processing (NLP) and computer vision tasks, but it imposes quadratic time and space complexity in terms of the input sequence length. While the training parallelism of transformers allows for competitive performance, inference is unfortunately slow and expensive due to this computational complexity. The present disclosure provides a computer vision retention model that is configured for both parallel training and recurrent inference, which can enable competitive performance during training and fast and memory-efficient inference during deployment.
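The dual parallel/recurrent property described above can be illustrated with the retention mechanism from the retentive-network literature (this sketch is an assumption about the underlying mechanism, not the specific patented model): a decay-masked attention-like product for parallel training is mathematically equivalent to a constant-memory recurrence for inference.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 5, 4          # sequence length and head dimension (toy sizes)
gamma = 0.9          # exponential decay factor
Q = rng.standard_normal((T, d))
K = rng.standard_normal((T, d))
V = rng.standard_normal((T, d))

# Parallel form (training): O = ((Q K^T) * D) V, with causal decay mask
# D[n, m] = gamma^(n-m) for n >= m, else 0.
D = np.array([[gamma ** (n - m) if n >= m else 0.0 for m in range(T)]
              for n in range(T)])
O_parallel = ((Q @ K.T) * D) @ V

# Recurrent form (inference): one d x d state, O(1) memory per step.
# S_n = gamma * S_{n-1} + k_n v_n^T,  o_n = q_n S_n.
S = np.zeros((d, d))
O_recurrent = np.zeros((T, d))
for n in range(T):
    S = gamma * S + np.outer(K[n], V[n])
    O_recurrent[n] = Q[n] @ S
```

Unrolling the recurrence gives o_n = sum over m <= n of gamma^(n-m) (q_n . k_m) v_m, which is exactly the masked parallel product, so the two forms produce identical outputs.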
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Apparatuses, systems, and techniques to train neural networks. In at least one embodiment, a first normalization of learned parameters of one or more learned layers is performed during a forward pass of a training iteration and a second normalization of the learned parameters is performed during a parameter update phase of the training iteration. In at least one embodiment, the first normalization is performed using first scaling factors and the second normalization is performed using second scaling factors.
Apparatuses, systems, and techniques to train neural networks and to use neural networks to perform inference. In at least one embodiment, a balanced concatenation layer performs a balanced concatenation operation during a forward pass of a training iteration during the training of a neural network. In at least one embodiment, a balanced concatenation layer performs a balanced concatenation operation during the use of a neural network to perform inference.
Apparatuses, systems, and techniques to compute neural network parameters and to use a neural network to perform inference. In at least one embodiment, neural network parameters are computed, after training, by determining a weighted average of snapshots of averaged parameters that form a basis set of averaged parameter snapshots, each respective snapshot of averaged parameters including a plurality of network parameters averaged by a respective combination of an averaging function and one or more averaging parameters.
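A minimal sketch of this post-training averaging idea follows. The use of exponential moving averages at several decay rates as the "averaging function and averaging parameters", and the fixed combination weights, are illustrative assumptions; the abstract covers any such basis of averaged snapshots.

```python
import numpy as np

rng = np.random.default_rng(2)

def ema_snapshots(trajectory, decays):
    """Build one averaged-parameter snapshot per (EMA, decay) combination.

    Each snapshot averages the same training trajectory of weights, but with
    a different decay, forming a basis set of averaged parameter snapshots.
    """
    snaps = []
    for beta in decays:
        avg = trajectory[0].copy()
        for w in trajectory[1:]:
            avg = beta * avg + (1.0 - beta) * w
        snaps.append(avg)
    return snaps

# Hypothetical weight trajectory recorded during training (toy: 6 parameters).
trajectory = [rng.standard_normal(6) for _ in range(50)]
basis = ema_snapshots(trajectory, decays=[0.9, 0.99, 0.999])

# Final parameters: a weighted average over the basis of snapshots; in
# practice the coefficients could be chosen on validation data.
coeffs = np.array([0.2, 0.3, 0.5])
final = sum(c * s for c, s in zip(coeffs, basis))
```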
Systems and methods are directed toward evaluating auditory inputs against a range of tolerance to provide feedback regarding pronunciation. An auditory input may be evaluated using a trained machine learning system and evaluated for similarity against a target word. Similarity may be scored and then evaluated to determine whether the similarity falls within a range of tolerance, wherein the range of tolerance may be adjusted or modified for particular uses. A score within the range of tolerance is indicative of a word that has been pronounced such that it would be perceptible.
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit
73.
IMAGE UPSAMPLING USING ONE OR MORE NEURAL NETWORKS
Apparatuses, systems, and techniques are presented to reconstruct one or more images. In at least one embodiment, one or more neural networks are used to upsample one or more images based, at least in part, on one or more brightness values.
In various examples, visual differences between video data and an encoded version of the video data are used to determine updates to quantization parameters (QPs) used to encode the video data to store or upload video clips that correspond to notable events associated with a machine. The video data may correspond to images applied to a machine learning model to perform control operations for the machine. A metric may be used to quantify the visual differences. To evaluate the visual differences, samples of the video data and encoded video data may be determined and analyzed, rather than entire images or frames. To selectively enable updates to QPs, the system may detect that a deviation between a bitrate corresponding to the encoded video data and a reference bitrate has exceeded a threshold.
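The selective-update gating in the last sentence can be sketched as a small controller (threshold, step size, and QP bounds are illustrative assumptions; the 0-51 range follows common H.264/H.265 convention, not the abstract):

```python
def update_qp(qp, measured_bitrate, reference_bitrate,
              deviation_threshold=0.15, step=1, qp_min=0, qp_max=51):
    """Adjust the quantization parameter only when the relative deviation
    between the encoded bitrate and a reference bitrate exceeds a threshold;
    otherwise leave the QP untouched."""
    deviation = (measured_bitrate - reference_bitrate) / reference_bitrate
    if abs(deviation) <= deviation_threshold:
        return qp                      # within tolerance: no update
    if deviation > 0:                  # stream too large: coarser quantization
        return min(qp + step, qp_max)
    return max(qp - step, qp_min)      # stream too small: finer quantization
```

A full system would also fold in the sampled visual-difference metric when picking the step direction and size; this sketch shows only the bitrate-deviation gate.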
Embodiments of the present disclosure relate to performing navigation operations based on junction area information digested at runtime of one or more systems of a machine. In some embodiments, the junction area information may be encoded in map data corresponding to a map, and the encoding may include organizing vehicle paths that traverse through a junction area according to path groups and organizing contentions that influence behavior of vehicles traveling along the vehicle paths according to contention groups. In addition, the encoding may include generating direction data structures that associate respective path groups with one or more of the contention groups. In these or other embodiments, the map data that corresponds to the junction area may be updated with direction data structures.
Systems and methods are directed toward a platform resiliency authority that may be used to identify a component capability to manage one or more update requests using on-component structures. When an update request is received, a component capability level may be determined and, if the component is unable to perform each of a protection, a detection, and a recovery operation on-component, additional update information may be identified and used to cause the update to be installed for the component.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
Reference voltage generators including a header circuit configured to pass current from a power supply to a time-to-digital converter, an amount of the current to pass determined by a thermometer code, and logic to update the thermometer code based on a comparison between an output of the time-to-digital converter and a digital code representing a reference voltage level.
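The update logic described above is hardware, but its control law can be sketched in software. The bang-bang step rule and the canonical low-bits-set thermometer encoding below are assumptions for illustration:

```python
def update_thermometer(code_width, code, tdc_output, target_code):
    """One update step for the header's thermometer code.

    If the time-to-digital converter reads below the digital code for the
    reference level, enable one more header leg (more current); if above,
    disable one. Thermometer codes set the low `ones` bits.
    """
    ones = bin(code).count("1")  # number of enabled header legs
    if tdc_output < target_code and ones < code_width:
        ones += 1
    elif tdc_output > target_code and ones > 0:
        ones -= 1
    return (1 << ones) - 1
```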
G05F 1/56 - Regulating voltage or current wherein the variable actually regulated by the final control device is DC using semiconductor devices in series with the load as final control devices
G04F 10/00 - Apparatus for measuring unknown time intervals by electric means
H03M 7/16 - Conversion to or from unit-distance codes, e.g. Gray code, reflected binary code
Power delivery systems for integrated circuits that include a first metal path traversing first metal layers from a global power domain supply to a voltage regulator, a second metal path traversing second metal layers from a local power domain supply to the voltage regulator, and a third metal path traversing third metal layers from the local power domain supply to an integrated circuit. Electrical isolation gaps are formed between the first metal layers, the second metal layers, and the third metal layers.
H02M 3/155 - Conversion of DC power input into DC power output without intermediate conversion into AC by static converters using discharge tubes with control electrode or semiconductor devices with control electrode using devices of a triode or transistor type requiring continuous application of a control signal using semiconductor devices only
H01L 23/528 - Layout of the interconnection structure
79.
LOW LATENCY SYNCHRONIZATION OF VIDEO RENDERING PIPELINES WITH HIGH REFRESH RATES
Disclosed are apparatuses, systems, and techniques that reduce latency of frame processing pipelines. The techniques include but are not limited to causing a display to set a refresh rate that matches a frame rendering rate of an application and rendering, with the frame rendering rate, a plurality of frames. Rendering frames includes generating, using a first processing unit, sets of instructions associated with respective frames, the sets of instructions generated starting at times spaced according to the refresh rate; processing, using a second processing unit, the sets of instructions to render the frames; and causing the display to display the rendered frames.
Apparatuses, systems, and methods to cause one or more locations of one or more objects within one or more images to be identified based, at least in part, on one or more locations of the one or more objects within one or more previous images.
In various examples, a machine learning model—such as a deep neural network (DNN)—may be trained to use image data and/or other sensor data as inputs to generate two-dimensional or three-dimensional trajectory points in world space, a vehicle orientation, and/or a vehicle state. For example, sensor data that represents orientation, steering information, and/or speed of a vehicle may be collected and used to automatically generate a trajectory for use as ground truth data for training the DNN. Once deployed, the trajectory points, the vehicle orientation, and/or the vehicle state may be used by a control component (e.g., a vehicle controller) for controlling the vehicle through a physical environment. For example, the control component may use these outputs of the DNN to determine a control profile (e.g., steering, decelerating, and/or accelerating) specific to the vehicle for controlling the vehicle through the physical environment.
Techniques applicable to a ray tracing hardware accelerator for traversing a hierarchical acceleration structure with reduced false positive ray intersections are disclosed. The reduction of false positives may be based upon one or more of selectively performing a secondary higher precision intersection test for a bounding volume, identifying and culling bounding volumes that degenerate to a point, and parametrically clipping rays that exceed certain configured distance thresholds.
A hardware-based traversal coprocessor provides acceleration of tree traversal operations searching for intersections between primitives represented in a tree data structure and a ray. The primitives may include opaque and alpha triangles used in generating a virtual scene. The hardware-based traversal coprocessor is configured to determine primitives intersected by the ray, and return intersection information to a streaming multiprocessor for further processing. The hardware-based traversal coprocessor is configured to provide a deterministic result of intersected triangles regardless of the order that the memory subsystem returns triangle range blocks for processing, while opportunistically eliminating alpha intersections that lie further along the length of the ray than closer opaque intersections.
A trajectory for an autonomous machine may be evaluated for safety based at least on determining whether the autonomous machine would be capable of occupying points of the trajectory in space-time while still being able to avoid a potential future collision with one or more objects in the environment through use of one or more safety procedures. To do so, a point of the trajectory may be evaluated for conflict based at least on a comparison between points in space-time that correspond to the autonomous machine executing the safety procedure(s) from the point and arrival times of the one or more objects to corresponding position(s) in the environment. A trajectory may be sampled and evaluated for conflicts at various points throughout the trajectory. Based on results of one or more evaluations, the trajectory may be scored, eliminated from consideration, or otherwise considered for control of the autonomous machine.
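A minimal sketch of the space-time conflict check follows. The discretization into named positions, the fixed time margin, and the scoring rule are all illustrative assumptions, not the disclosed method:

```python
def conflict_free(procedure_times, object_arrivals, margin=0.5):
    """A trajectory point passes if, for every position the safety procedure
    would occupy (position -> machine occupancy time), the machine clears it
    at least `margin` seconds before any object arrives there."""
    for pos, t_machine in procedure_times.items():
        t_object = object_arrivals.get(pos)
        if t_object is not None and t_object - t_machine < margin:
            return False
    return True

def score_trajectory(sampled_points, object_arrivals):
    """Sample points along a trajectory and score by the conflict-free rate;
    a low score could eliminate the trajectory from consideration."""
    safe = sum(conflict_free(p, object_arrivals) for p in sampled_points)
    return safe / len(sampled_points)
```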
Apparatuses, systems, and techniques for compiled shader program caches in a cloud computing environment. A set of compiled shader programs associated with an instance of an application hosted by an application hosting platform is received. The set of compiled shader programs is included in a shader cache associated with the application, the shader cache hosted by the application hosting platform. A detection is made that a shader program is referenced during a runtime of the instance of the application. Responsive to a determination that a compiled version of the referenced shader program is not included in the received set of compiled shader programs, the shader program is compiled to generate the compiled version of the shader program. A request is transmitted to the application hosting platform to modify the shader cache in view of the compiled version of the shader program.
Apparatuses, systems, and techniques to process image frames. In at least one embodiment, one or more intermediate video frames are generated between a first video frame and a second video frame. In at least one embodiment, the one or more intermediate video frames are generated based, at least in part, on depth information of one or more pixels of the first video frame or second video frame.
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
88.
QUERY-SPECIFIC BEHAVIORAL MODIFICATION OF TREE TRAVERSAL
Methods and systems are described in some examples for changing the traversal of an acceleration data structure in a highly dynamic query-specific manner, with each query specifying test parameters, a test opcode and a mapping of test results to actions. In an example ray tracing implementation, traversal of a bounding volume hierarchy by a ray is performed with the default behavior of the traversal being changed in accordance with results of a test performed using the test opcode and test parameters specified in the ray data structure and another test parameter specified in a node of the bounding volume hierarchy. In an example implementation a traversal coprocessor is configured to perform the traversal of the bounding volume hierarchy.
One embodiment of the present invention sets forth a technique for performing meta-learning. The technique includes performing a first set of training iterations to convert a prediction learning network into a first trained prediction learning network based on a first support set of training data and executing a representation learning network and the first trained prediction learning network to generate a first set of supervised training output and a first set of self-supervised training output based on a first query set of training data corresponding to the first support set of training data. The technique also includes performing a first training iteration to convert the representation learning network into a first trained representation learning network based on a first loss associated with the first set of supervised training output and a second loss associated with the first set of self-supervised training output.
An optical apparatus, with an optical interconnect, the optical interconnect including a first optical transceiver having a first notch filter, the first notch filter including first and second optical add drop multiplexer demultiplexers connected to receive a continuous wave light beam and send first and second filtered wavelengths to first and second resonant modulators, which send first and second modulated optical signals through a light propagation path. The second filtered wavelength is different from the first filtered wavelength, and the second modulated optical signal has a polarity that is orthogonal to a polarity of the first modulated optical signal. Methods of communicating using the apparatus and an optical filter for use in an optical transceiver are also disclosed.
H04B 10/80 - Optical aspects relating to the use of optical transmission for specific applications, not provided for in groups , e.g. optical power feeding or optical transmission through water
H04J 14/02 - Wavelength-division multiplex systems
Apparatuses, systems, and techniques to generate an image using a neural network based model using a variable error threshold. In at least one embodiment, one or more neural networks are used to generate a final output image by iteratively removing noise from an initial image based, at least in part, on one or more variable error threshold values.
One embodiment of a method for determining object poses includes receiving first sensor data and second sensor data, where the first sensor data is associated with a first modality, and the second sensor data is associated with a second modality that is different from the first modality, and performing one or more iterative operations to determine a pose of an object based on one or more comparisons of (i) one or more renderings of a three-dimensional (3D) representation of the object in the first modality with the first sensor data, and (ii) one or more renderings of the 3D representation of the object in the second modality with the second sensor data.
Embodiments of the present disclosure relate to neural components for differentiable ray tracing of radio propagation. Differentiable ray tracing may be used to refine the scene geometry of the physical environment, to learn or optimize the scene properties of objects in the scene, to learn or optimize the scene properties of antennas, and to learn or optimize antenna patterns, array geometries, and orientations and positions of transmitters and receivers. Once scene properties have been learned or optimized, the differentiable ray tracer may further be used to simulate the performance of different configurations of the transmitters, receivers, and scene geometry. In an embodiment, one or more of the scene geometry, scene properties, and antenna characteristics are computed by a differentiable parametric function, such as a neural network, etc. and parameters of the differentiable parametric function are learned using the differentiable ray tracing.
The performance of a neural network is improved by applying quantization to data at various points in the network. In an embodiment, a neural network includes two paths. A quantization is applied to each path, such that when an output from each path is combined, further quantization is not required. In an embodiment, the neural network is an autoencoder that includes at least one skip connection. In an embodiment, the system determines a set of quantization parameters based on the characteristics of the data in the primary path and in the skip connection, such that both network paths produce output data in the same fixed point format. As a result, the data from both network paths can be combined without requiring an additional quantization.
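A sketch of the shared-format idea follows. Choosing one affine scale/zero-point over the joint range of both paths is one simple way to realize "both network paths produce output data in the same fixed point format"; the specific parameter-selection rule here is an assumption:

```python
import numpy as np

def choose_qparams(*tensors, num_bits=8):
    """Pick one scale/zero-point covering the joint range of all tensors,
    so every path quantizes into the same fixed-point format."""
    lo = min(float(t.min()) for t in tensors)
    hi = max(float(t.max()) for t in tensors)
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(t, scale, zp, num_bits=8):
    q = np.clip(np.round(t / scale + zp), 0, 2 ** num_bits - 1)
    return q.astype(np.uint8)

rng = np.random.default_rng(3)
main = rng.uniform(-1.0, 1.0, 16)   # primary decoder path (toy data)
skip = rng.uniform(-0.5, 1.5, 16)   # skip connection from the encoder

scale, zp = choose_qparams(main, skip)
q_main = quantize(main, scale, zp).astype(np.int32)
q_skip = quantize(skip, scale, zp).astype(np.int32)

# Because both paths share one format, the integer sum can be dequantized
# directly, with no extra requantization at the merge point:
approx = (q_main + q_skip - 2 * zp) * scale
```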
G05D 1/00 - Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G05D 1/227 - Handing over between remote control and on-board control; Handing over between remote control arrangements
G05D 1/249 - Arrangements for determining position or orientation using signals provided by artificial sources external to the vehicle, e.g. navigation beacons from positioning sensors located off-board the vehicle, e.g. from cameras
G06N 3/04 - Architecture, e.g. interconnection topology
G06N 3/043 - Architecture, e.g. interconnection topology based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
Diffusion models are machine learning algorithms that are uniquely trained to generate high-quality data from lower-quality input data. Diffusion probabilistic models use discrete-time random processes or continuous-time stochastic differential equations (SDEs) that learn to gradually remove the noise added to the data points. With diffusion probabilistic models, high-quality output currently requires sampling from a large diffusion probabilistic model, which comes at a high computational cost. The present disclosure stitches together the trajectories of two or more inferior diffusion probabilistic models during a denoising process, which can in turn accelerate the denoising process by avoiding use of only a single large diffusion probabilistic model.
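The control flow of such trajectory stitching can be sketched with stand-in denoisers (the linear maps, step counts, and switch point below are toy assumptions; real models would be learned networks and the switch schedule would be chosen to exploit where each model performs best):

```python
import numpy as np

rng = np.random.default_rng(4)

# Two stand-in denoisers; in practice these would be two smaller diffusion
# probabilistic models, each mapping (noisy sample, step) -> less noisy sample.
def denoiser_a(x, t):
    return x * 0.9   # e.g. stronger at high-noise (early) steps

def denoiser_b(x, t):
    return x * 0.8   # e.g. stronger at low-noise (late) steps

def stitched_denoise(x, num_steps=10, switch_at=5):
    """Run the first model for the early part of the denoising trajectory,
    then hand the intermediate sample to the second model, instead of
    running one large model end to end."""
    for t in range(num_steps):
        model = denoiser_a if t < switch_at else denoiser_b
        x = model(x, t)
    return x

x0 = rng.standard_normal(8)
out = stitched_denoise(x0)
```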
Apparatuses, systems, and techniques to generate an image of an environment. In at least one embodiment, one or more neural networks are used to identify one or more static and dynamic features of an environment to be used to generate a representation of the environment.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersectionsConnectivity analysis, e.g. of connected components
H04N 13/279 - Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals the virtual viewpoint locations being selected by the viewers or determined by tracking
97.
SPEECH-TO-TEXT PROCESSING ASSISTED WITH LANGUAGE MODELS FOR CONVERSATIONAL AI SYSTEMS AND APPLICATIONS
Disclosed are apparatuses, systems, and techniques that implement training and deployment of speech-augmented language models for efficient capturing and processing of speech inputs. The techniques include processing, using a speech model, an audio input in a first language to generate a first portion of an input into a language model (LM). A second portion of the input into the LM represents a text context associated with the audio input. The techniques further include receiving, from the LM, an output that includes a speech-to-text conversion of the audio input.
Apparatuses, systems, and techniques to construct a neural network architecture. In at least one embodiment, candidate neural components may be selected for the neural network by jointly updating performance metric masks attached to these candidate neural components and a union neural network comprising all candidate neural components.
In various examples, a trajectory prediction model provides interpretable trajectory predictions for autonomous and semi-autonomous systems and applications via counterfactual game-theoretic reasoning. Model-based latent variables can be formulated through responsibility evaluations. Responsibility can be broken into multiple components, such as safety and courtesy. Responsibility can be quantified, for example, by answering a counterfactual question: could an agent have acted differently to respect other agents' safety and be more courteous to others' plans? The framework can be used to abstract computed responsibility sequences into different responsibility levels and ground latent levels into a trajectory prediction model able to render interpretable and accurate inferences about trajectories.
One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes executing a first attention unit included in the transformer neural network to convert a first input token into a first query, a first key, and a first plurality of values, where each value included in the first plurality of values represents a sub-task associated with the transformer neural network. The technique also includes computing a first plurality of outputs associated with the first input token based on the first query, the first key, and the first plurality of values. The technique further includes performing a task associated with an input corresponding to the first input token based on the first input token and the first plurality of outputs.