Systems and methods relate to the determination of accurate motion vectors in rendering situations, such as noisy Monte Carlo integration, where image object surfaces are at least partially translucent. To optimize the search for “real world” positions, this invention defines the background as first path vertices visible through multiple layers of refractive interfaces. To find matching world positions, the background is treated as a single layer morphing in a non-chaotic way, permitting the optimized algorithm to be executed only once. Further improving performance over prior linear gradient descent, the present techniques can apply a cost function and numerical optimization, such as Newton's method or another convergence function, to locate pixels via vector angle minimization. Determined motion vectors can then serve as input for services including image denoising.
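As a rough illustration of the optimization step described above (not the claimed method), the sketch below minimizes the angle between a candidate world position and a target world position over a 1D screen parameter using Newton's method; the background shape, cost form, and step counts are all illustrative assumptions.

```python
import math

def cos_angle(u, v):
    """Cosine of the angle between two 2D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))

def newton_minimize(f, x0, h=1e-4, iters=25):
    """Newton's method on a smooth 1D cost, with central-difference
    approximations of the first and second derivatives."""
    x = x0
    for _ in range(iters):
        d1 = (f(x + h) - f(x - h)) / (2 * h)
        d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
        if abs(d2) < 1e-12:
            break
        x -= d1 / d2
    return x

# Toy single-layer background: world position as a function of a
# 1D screen parameter t (hypothetical shape, for illustration only).
world = lambda t: (t, 0.25 * t * t)
target = (1.0, 0.25)                    # matching world position sought
cost = lambda t: 1.0 - cos_angle(world(t), target)  # 0 when aligned
t_star = newton_minimize(cost, 0.5)     # converges near t = 1.0
```

The cost is written as 1 − cos(angle) rather than the angle itself so it stays smooth at the minimum, which Newton's method requires.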
In various examples, methods and systems are provided for estimating depth values for images (e.g., from a monocular sequence). Disclosed approaches may define a search space of potential pixel matches between two images using one or more depth hypothesis planes based at least on a camera pose associated with one or more cameras used to generate the images. A machine learning model(s) may use this search space to predict likelihoods of correspondence between one or more pixels in the images. The predicted likelihoods may be used to compute depth values for one or more of the images. The predicted depth values may be transmitted and used by a machine to perform one or more operations.
G06T 7/55 - Depth or shape recovery from multiple images
G06T 7/70 - Determining position or orientation of objects or cameras
G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
A capacitive monitoring structure includes a ring oscillator and dynamically configurable capacitive load circuits coupled between stages of the ring oscillator. The oscillation frequency of the ring oscillator changes in response to settings applied to the capacitive load circuits to change a capacitance applied to the ring oscillator by the capacitive load circuits, where the capacitance may be one of a transistor drain capacitance, a transistor gate capacitance, and a Miller capacitance.
G01R 27/26 - Measuring inductance or capacitance; Measuring quality factor, e.g. by using the resonance method; Measuring loss factor; Measuring dielectric constants
H01L 27/092 - Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including integrated passive circuit elements with at least one potential-jump barrier or surface barrier the substrate being a semiconductor body including only semiconductor components of a single kind including field-effect components only the components being field-effect transistors with insulated gate complementary MIS field-effect transistors
4.
NEURAL NETWORK BASED DETERMINATION OF GAZE DIRECTION USING SPATIAL MODELS
Systems and methods for determining the gaze direction of a subject and projecting this gaze direction onto specific regions of an arbitrary three-dimensional geometry. In an exemplary embodiment, gaze direction may be determined by a regression-based machine learning model. The determined gaze direction is then projected onto a three-dimensional map or set of surfaces that may represent any desired object or system. Maps may represent any three-dimensional layout or geometry, whether actual or virtual. Gaze vectors can thus be used to determine the object of gaze within any environment. Systems can also readily and efficiently adapt for use in different environments by retrieving a different set of surfaces or regions for each environment.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
Approaches presented herein provide systems and methods for dynamic allocation of processing units to increase computational density. Idle times between sequential processing tasks may be computed and, if the idle time exceeds a threshold capacity, additional sequential processing tasks may be allocated to a common processing unit. As a result, a first portion of a first sequential processing task may be executed, then a portion of a second sequential processing task may be executed prior to executing a subsequent portion of the first sequential processing task. By using the idle time between portions of sequential processing tasks, output performance may be maintained while using additional processing capabilities that would otherwise remain idle.
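A toy sketch of the allocation policy described above (the greedy first-fit rule, units, and millisecond budgets are illustrative assumptions, not the claimed scheduler):

```python
def allocate(tasks, idle_threshold):
    """Greedy allocation: a task joins an existing unit only when that
    unit's remaining idle time exceeds the threshold and can absorb
    the task's busy portion; otherwise a new unit is opened.
    tasks: list of (name, busy_ms, idle_ms_between_portions)."""
    units = []                      # each: {"tasks": [...], "spare": ms}
    for name, busy, idle in tasks:
        for unit in units:
            if unit["spare"] >= idle_threshold and unit["spare"] >= busy:
                unit["tasks"].append(name)
                unit["spare"] -= busy     # idle time now consumed
                break
        else:
            units.append({"tasks": [name], "spare": idle})
    return [u["tasks"] for u in units]
```

For example, `allocate([("A", 2, 6), ("B", 3, 1), ("C", 4, 8)], idle_threshold=5)` co-schedules B in A's idle gap but opens a new unit for C, yielding `[["A", "B"], ["C"]]`.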
Approaches presented herein provide systems and methods to selectively apply packet loss mitigation methods to one or more regions of a frame that have a sufficient importance value. The importance value may be determined by a saliency map generated for the frame that determines the most important content elements or regions of the frame. An importance value may be computed based on the saliency map and then, for areas of sufficient importance, selective mitigation methods may be used to reduce bandwidth, conserve compute resources, and provide error correction or duplication for important regions of the frame.
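A minimal sketch of the per-region decision (the mean-saliency importance measure, region layout, and threshold are illustrative assumptions):

```python
def mitigation_plan(saliency, regions, threshold=0.5):
    """Decide per region whether to apply loss mitigation (e.g. FEC or
    packet duplication) based on mean saliency inside the region.
    saliency: 2D list of per-pixel values in [0, 1];
    regions: dict name -> (row0, row1, col0, col1), half-open."""
    plan = {}
    for name, (r0, r1, c0, c1) in regions.items():
        vals = [saliency[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        importance = sum(vals) / len(vals)
        plan[name] = "protect" if importance >= threshold else "skip"
    return plan

saliency = [[0.9, 0.9, 0.1, 0.1],
            [0.9, 0.9, 0.1, 0.1],
            [0.2, 0.2, 0.0, 0.0],
            [0.2, 0.2, 0.0, 0.0]]
plan = mitigation_plan(saliency,
                       {"face": (0, 2, 0, 2), "sky": (2, 4, 2, 4)})
```

Only the high-saliency "face" region receives protection, so bandwidth spent on error correction tracks where errors would actually be noticed.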
H04N 21/238 - Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
H04N 21/24 - Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth or upstream requests
7.
SPEED DETERMINATION IN ROBOTICS SYSTEMS AND APPLICATIONS
In various examples, a technique for generating speed change decisions for a mobile robot includes identifying, using one or more maps of a physical environment, one or more obstacles associated with one or more portions of a path of the mobile robot in the physical environment. The technique also includes generating, based at least on the one or more obstacles, one or more speed constraints, each speed constraint specifying a speed limit for a respective portion of the path. The technique further includes generating one or more speed change decisions specifying actions to be performed by the mobile robot to cause a speed profile of the mobile robot to satisfy the one or more speed constraints.
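A toy sketch of turning per-portion speed constraints into speed change decisions (the decision vocabulary and the choice to accelerate up to each limit are illustrative assumptions):

```python
def speed_plan(portion_limits, current_speed):
    """Emit one speed-change decision per path portion so the robot's
    speed profile satisfies every per-portion speed constraint.
    portion_limits: speed limit (m/s) for each portion, already the
    minimum over that portion's obstacle-derived constraints."""
    decisions, v = [], current_speed
    for limit in portion_limits:
        if v > limit:
            decisions.append(("decelerate", limit))   # must slow down
        elif v < limit:
            decisions.append(("accelerate", limit))   # may speed up
        else:
            decisions.append(("maintain", v))
        v = limit
    return decisions
```

For example, starting at 3.0 m/s over portions limited to 2.0, 2.0, 1.0, and 3.0 m/s yields decelerate, maintain, decelerate, accelerate.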
Approaches presented herein provide for the matching and alignment of features in different instances of sensor data corresponding to an environment. At least one embodiment provides for accurate identification of matching lane dividers between two or more tracks obtained from sensor-equipped vehicles or machines. An initial transform can be determined using a seed area for tracks of data, where the seed area can be determined using landmarks, lane boundaries, or other such objects identified from the sensor data. The initial transform can be used to determine lane divider matches in the track data. If successfully evaluated, these lane divider matches from the seed areas can be propagated out in one or more tracking directions along a roadway to determine lane divider matches along entire stretches of roadway, including roads that pass through intersections or other relatively complex regions.
Disclosed are apparatuses, systems, and techniques for implementing bidirectional tracking in computer vision applications. In one embodiment, the techniques include obtaining digital representations of an object depicted in video frames and for each of a forward direction (FD) of tracking and a reverse direction (RD) of tracking, obtaining, using (i) a current state of the object associated with an upstream video frame and (ii) the digital representation of the object for a downstream video frame, an updated state of the object associated with the downstream video frame. The techniques further include obtaining, using the updated state of the object for the FD and/or the updated state of the object for the RD, a bidirectional state of the object, and determining, using the bidirectional state of the object, a trajectory of the object across the video frames.
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
10.
APPLICATION PROGRAMMING INTERFACE TO INDICATE MEMORY ACCESS
Apparatuses, systems, and techniques to perform an API to cause storage to be reserved. In at least one embodiment, for example, an API causes storage to be reserved based, at least in part, on a flag indicating a memory pool to be allocated to a memory of a processor, such as a GPU or CPU. In at least one embodiment, as another example, a processor comprising one or more circuits performs an application programming interface (API) to indicate whether one or more graphics processing units (GPU) are to access GPU storage or host storage.
Denoising images rendered using Monte Carlo sampled ray tracing is an important technique for improving image quality when low sample counts are used. Ray traced scenes that include volumes in addition to surface geometry are more complex, and noisier when low sample counts are used to render in real time. Joint neural denoising of surfaces and volumes enables combined volume and surface denoising in real time from low sample count renderings. At least one rendered image is decomposed into volume and surface layers, leveraging spatio-temporal neural denoisers for both the surface and volume components. The individual denoised surface and volume components are composited using learned weights and denoised transmittance. A surface and volume denoiser architecture outperforms current denoisers in scenes containing both surfaces and volumes, and produces temporally stable results at interactive rates.
Enhanced techniques applicable to a ray tracing hardware accelerator for traversing a hierarchical acceleration structure and its underlying primitives are disclosed. For example, traversal speed is improved by grouping processing of primitives sharing at least one feature (e.g., a vertex or an edge) during ray-primitive intersection testing. Grouping the primitives for ray intersection testing can reduce processing (e.g., projections and transformations of primitive vertices and/or determining edge function values) because at least a portion of the processing results related to the shared feature in one primitive can be used to determine whether the ray intersects another primitive(s). Processing triangles sharing an edge can double the culling rate of the triangles in the ray/triangle intersection test without replicating the hardware.
Apparatuses, systems, and techniques to scale processor clocks. In at least one embodiment, one or more circuits are to scale one or more clocks of one or more cores based, at least in part, on a proximity of the one or more cores to each other.
Approaches presented herein can provide for the performance of specific types of tasks using a large model, without a need to retrain the model. Custom endpoints can be trained for specific types of tasks, as may be indicated by the specification of one or more guidance mechanisms. A guidance mechanism can be added to or used along with a request to guide the model in performing a type of task with respect to a string of text. An endpoint receiving such a request can perform any marshalling needed to get the request in a format required by the model, and can add the guidance mechanisms to the request by, for example, prepending one or more text strings (or text prefixes) to a text-formatted request. A model receiving this string can process the text according to the guidance mechanisms. Such an approach can allow for a variety of tasks to be performed by a single model.
In various examples, a technique for generating a path between a current location and a target waypoint is disclosed that includes receiving a route plan that is associated with a plurality of waypoints representing locations in a physical environment. The technique also includes identifying a search space that includes the route plan, and identifying a target waypoint of the plurality of waypoints—the target waypoint being in a portion of the search space between a current location of a mobile robot and an end waypoint of the route plan. A path between the current location of the mobile robot and the target waypoint may then be generated.
Apparatuses, systems, and techniques to autonomously adjust the priority of information to be transmitted by a UE device. In at least one embodiment, a UE device autonomously adjusts the priority of information to be transmitted by adjusting prioritization parameters. In at least one embodiment, prioritization parameters are adjusted autonomously by a UE device that monitors packet statistics.
Apparatuses, systems, and techniques to cause one or more neural networks to summarize a text. In at least one embodiment, a processor is to cause one or more neural networks to generate one or more summaries of a first portion of a text based, at least in part, on one or more second portions of said text.
In various examples, systems and methods are disclosed relating to health and error monitoring of sensor fusion systems. Systems and methods are disclosed that aggregate results of monitoring and error checking in a sensor fusion system in a single checkpoint. A processor may include one or more circuits. The one or more circuits may receive perception data from one or more first sensors of a machine. The one or more circuits may receive position data from one or more second sensors of the machine. The one or more circuits may generate output data by performing fusion of at least the perception data and the position data. The one or more circuits may evaluate a plurality of criteria according to at least a subset of the perception data, the position data, and the output data. The one or more circuits may output an error signal according to the evaluation.
In various examples, each hosted application may be modeled with a corresponding application-specific resource consumption model that predicts a measure of that application's anticipated resource utilization at some future time based on an input representation of one or more features of the current state of an instance of the hosted application. For cloud gaming, those features may include the current level being played, current obstacles, user results playing the level or obstacles, metadata quantifying one or more aspects of the level or obstacles, game progress, etc. As such, application-specific models may be used to predict resource demands at a future time and schedule resource allocations accordingly. The present techniques may be used to manage and reallocate resources for applications such as game streaming applications, remote desktop applications, simulation applications (e.g., an autonomous or semi-autonomous vehicle simulation), virtual reality (VR) and/or augmented reality (AR) streaming applications, and/or other application types.
Apparatuses, systems, and techniques are presented to reduce noise in audio. In at least one embodiment, a sequence of neural networks is used to remove foreground and background noise from audio including a primary audio signal.
G10L 21/00 - Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/84 - Detection of presence or absence of voice signals for discriminating voice from noise
22.
FULLY CACHE COHERENT VIRTUAL PARTITIONS IN MULTITENANT CONFIGURATIONS IN A MULTIPROCESSOR SYSTEM
Various embodiments include techniques for processing memory operations in a computing system. The computing system includes a central processing unit (CPU) and an auxiliary processor, such as a parallel processing unit (PPU). The PPU can be divided into multiple partitions. Although the partitions are included in a single PPU, the CPU can track the partitions as if the partitions are independent devices rather than different portions of a single device. When two different partitions generate memory operations that access the same memory address in CPU memory address space, the two partitions employ two different data paths. The CPU can use path information for the two different paths to identify which partition generated each memory operation. As a result, the CPU can maintain data consistency and memory coherency in a system where a PPU is divided into multiple partitions.
Approaches presented herein provide systems and methods for reading a portion of data from an unprocessed memory segment to a registry and changing a status identifier for the unprocessed memory segment indicative of the data being read to the registry. The data may then be written to a destination memory segment from the registry. If a corresponding status identifier for the destination memory segment meets a value indicative that the destination memory segment has not yet been read into the registry, it may be read into the registry before being overwritten, and recursively written into its own destination memory segment.
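The recursive save-before-overwrite rule amounts to chasing displacement chains through memory. A toy sketch, assuming segment moves are described by a permutation (the list model, "unread"/"read" statuses, and one-slot registry are illustrative):

```python
def permute_in_place(mem, dest_of):
    """Move each segment i of `mem` to position dest_of[i] using a
    one-slot registry and per-segment status identifiers, saving a
    destination segment into the registry before it is overwritten
    (chain/cycle chasing).  dest_of must be a permutation."""
    status = ["unread"] * len(mem)
    for start in range(len(mem)):
        if status[start] != "unread":
            continue
        registry = mem[start]             # read source into the registry
        status[start] = "read"
        i = start
        while True:
            d = dest_of[i]
            if status[d] == "unread":     # destination not yet read:
                saved = mem[d]            # save it before overwriting
                status[d] = "read"
                mem[d] = registry
                registry, i = saved, d    # recurse on the saved segment
            else:
                mem[d] = registry         # chain closed back on itself
                break
    return mem
```

For example, `permute_in_place(["a", "b", "c", "d"], [1, 2, 0, 3])` rotates the first three segments without any scratch buffer beyond the registry.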
Various embodiments include techniques for performing memory synchronization operations between processors in a multiprocessor computing system. A first processor transfers data by issuing memory operations to store the data to a shared memory. The first processor issues an asynchronous release operation to a load store unit. In response, the load store unit issues a memory synchronization operation to ensure that the data associated with the memory operations is visible in the shared memory. While the asynchronous release operation is pending, the first processor is able to issue further instructions and perform other operations. When the data associated with the memory operations is visible in the shared memory, the memory synchronization operation completes and the load store unit writes a flag to a separate memory location. Upon detecting that the flag has been written, a second thread, and/or other threads, can reliably read the data stored in the shared memory.
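The release/flag pattern above can be sketched in software with threads (the `threading.Event` flag and helper thread stand in for the load store unit's asynchronous memory synchronization; this is an analogy, not the hardware mechanism):

```python
import threading

shared = {}                      # stand-in for the shared memory
flag = threading.Event()         # flag in a separate memory location

def producer():
    shared["payload"] = [1, 2, 3]        # memory operations storing data
    # Asynchronous release: hand the synchronization and flag write to
    # a helper so this thread can immediately issue further work.
    def release():
        # (In hardware, this is where the memory synchronization
        # operation completes, making the prior stores visible.)
        flag.set()
    threading.Thread(target=release).start()
    # ... producer continues with other, independent work here ...

def consumer():
    flag.wait(timeout=5)                 # block until the flag is written
    return shared.get("payload")         # data is now reliably visible

t = threading.Thread(target=producer)
t.start()
result = consumer()
t.join()
```

The key property mirrored here is that the producer never blocks on the synchronization itself; only the consumer waits, and only on the flag.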
Approaches presented herein provide for the acceleration of a human review process, such as the review of annotations generated by a human labeler. Annotations (at least partially) generated by a human reviewer can be provided as input to a machine learning model trained to infer a probability of the annotations including at least one error. Annotations with a low probability of including an error can be approved automatically, while annotations with a high probability (e.g., above a threshold) of including an error can be directed for human review. In order to keep the human reviewer engaged, artificial errors may be introduced at various times based on various engagement criteria.
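A minimal sketch of the triage step (the threshold value and the `error_prob` callable are illustrative assumptions; the trained model itself is out of scope):

```python
def route_annotations(annotations, error_prob, threshold=0.2):
    """Auto-approve annotations the model scores as low-risk; queue the
    rest for human review.  `error_prob` maps an annotation to the
    model's predicted probability that it contains at least one error."""
    auto_approved, needs_review = [], []
    for ann in annotations:
        if error_prob(ann) >= threshold:
            needs_review.append(ann)      # high risk: human review
        else:
            auto_approved.append(ann)     # low risk: approve automatically
    return auto_approved, needs_review
```

For example, with predicted error probabilities of 0.05, 0.9, and 0.1, only the middle annotation is routed to a human.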
Devices, systems, and techniques to incorporate lighting effects into computer-generated graphics. In at least one embodiment, a virtual scene comprising a plurality of lights is rendered by subdividing the virtual area, and information indicative of one or more lights in the virtual area, selected based on a stochastic model, is stored in a record corresponding to a subdivision of the virtual area. Pixels near a subdivision are rendered based on the light information stored in the subdivision.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a liquid-to-air heat exchanger is associated with a fan wall and a refrigerant-based cooling system to provide air cooling and refrigerant-based cooling to cool secondary coolant or fluid received from at least one cold plate.
A ray (e.g., a traced path of light, etc.) is generated from an originating pixel within a scene being rendered. Additionally, one or more shadow map lookups are performed for the originating pixel to estimate an intersection of the ray with alpha-tested geometry within the scene. A shadow map stores the distance of geometry as seen from the point of view of the light, and alpha-tested geometry includes objects within the scene being rendered that have a determined texture and opacity. Further, the one or more shadow map lookups are performed to determine a visibility value for the pixel (e.g., that identifies whether the originating pixel is in a shadow) and a distance value for the pixel (e.g., that identifies how far the pixel is from the light). Further still, the visibility value and the distance value for the pixel are passed to a denoiser.
Disclosed are apparatuses, systems, and techniques that use label-looping processing for efficient automatic speech recognition (ASR). The techniques include performing a plurality of iterations of an outer processing loop to identify content units (CUs) of a media item having multiple frames. An individual iteration of the outer processing loop includes updating, using a first neural network (NN) and an identified non-blank CU, a state of the media item and performing one or more iterations of an inner processing loop. An individual iteration of the inner processing loop includes processing, using a second NN, the state of the media item and an individual frame to predict a CU associated with the individual frame. The iterations of the inner processing loop are performed until the predicted CU corresponds to a non-blank CU. The identified plurality of CUs is used to generate a representation of the media item.
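The outer/inner loop structure can be sketched as a greedy transducer-style decode (the toy `predict` and `update` lambdas stand in for the two neural networks; the frame labels are illustrative):

```python
BLANK = "_"   # the blank content unit

def label_looping_decode(frames, predict, update_state, init_state):
    """Label-looping greedy decode: the inner loop advances over frames
    while the prediction is blank; the outer loop fires once per
    emitted non-blank content unit (CU) and updates the decoder state.
    predict(state, frame) -> CU and update_state(state, cu) -> state
    are stand-ins for the second and first neural networks."""
    state, out, t = init_state, [], 0
    while t < len(frames):
        cu = predict(state, frames[t])
        while cu == BLANK:                 # inner loop: skip blank frames
            t += 1
            if t == len(frames):
                return out
            cu = predict(state, frames[t])
        out.append(cu)                     # outer loop: non-blank CU found
        state = update_state(state, cu)    # state update via "first NN"
    return out

# Toy networks: emit the frame label unless it repeats the last CU.
predict = lambda state, frame: BLANK if frame == state else frame
update = lambda state, cu: cu
out = label_looping_decode(["a", "a", "b", "b", "a"], predict, update, BLANK)
```

Running on the repeated-label frame sequence collapses the repeats, producing `["a", "b", "a"]`; the state update runs only once per emitted CU rather than once per frame, which is the efficiency point of label looping.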
In various embodiments, an inductor package comprises a first inductor that is arranged in a first orientation and produces a first magnetic flux in a first direction; and a second inductor that is arranged in a second orientation and produces a second magnetic flux in a second direction that at least partially cancels the first magnetic flux, where the first direction is opposite the second direction. In some embodiments, a printed circuit board assembly comprises a printed circuit board (PCB) layer, a first inductor that is arranged on the PCB layer at a first orientation and produces a first magnetic flux in a first direction, and a second inductor that is arranged on the PCB layer at a second orientation and produces a second magnetic flux in a second direction that at least partially cancels the first magnetic flux.
In various examples, one or more output channels of a deep neural network (DNN) may be used to determine assignments of obstacles to paths. To increase the accuracy of the DNN, the input to the DNN may include an input image, one or more representations of path locations, and/or one or more representations of obstacle locations. The system may thus repurpose previously computed information—e.g., obstacle locations, path locations, etc.—from other operations of the system, and use them to generate more detailed inputs for the DNN to increase accuracy of the obstacle to path assignments. Once the output channels are computed using the DNN, computed bounding shapes for the objects may be compared to the outputs to determine the path assignments for each object.
Circuits that include one or more transmission lines to propagate a signal through a serially-arranged plurality of repeaters, and one or more control circuits to propagate control pulses to the repeaters, wherein a timing and duration of the control pulses is configured to operate the repeaters in current-mode signaling (CMS) mode during a state transition of the signal at the repeaters and to operate the repeaters in voltage-mode signaling (VMS) mode otherwise.
Apparatuses, systems, and techniques are presented to train and utilize one or more neural networks. A denoising diffusion generative adversarial network (denoising diffusion GAN) reduces a number of denoising steps during a reverse process. The denoising diffusion GAN does not assume a Gaussian distribution for large steps of the denoising process and applies a multimodal model to permit denoising with fewer steps. Systems and methods further minimize a divergence between a diffused real data distribution and a diffused generator distribution over several timesteps. Accordingly, various embodiments may enable faster sample generation, in which the samples are generated from noise using the denoising diffusion GAN.
The disclosure provides a voltage detecting circuit that detects voltage increases and voltage decreases using a diode drop and voltage thresholds. The voltage detecting circuit, referred to as a voltage variation detector, uses the diode to maintain a differential between the voltage being monitored and a voltage threshold. When the diode is reverse biased, the voltage variation detector generates a detecting signal indicating the monitored voltage crossed the voltage threshold. In one example, the method includes: (1) detecting at least one transition of a voltage across a voltage threshold, wherein the detecting is based on a transistor diode being reverse biased, (2) generating a detection signal when the voltage crosses the voltage threshold, and (3) performing one or more actions in response to the detection signal.
G01R 19/165 - Indicating that current or voltage is either above or below a predetermined value or within or outside a predetermined range of values
G01R 19/17 - Indicating that current or voltage is either above or below a predetermined value or within or outside a predetermined range of values giving an indication of the number of times this occurs
H02H 3/20 - Emergency protective circuit arrangements for automatic disconnection directly responsive to an undesired change from normal electric working condition, with or without subsequent reconnection responsive to excess voltage
In various embodiments, an inductor package comprises a first portion of an inductor coil that runs substantially along a first axis and carries an electrical current along the first axis in a first direction to produce a first magnetic flux, and a second portion of the inductor coil that runs substantially along the first axis and carries the electrical current along the first axis in a second direction to produce a second magnetic flux that at least partially cancels the first magnetic flux, where the first direction is opposite the second direction.
Apparatuses, systems, and techniques to generate images of objects. In at least one embodiment, one or more neural networks are trained to identify one or more objects within one or more images, and the one or more neural networks are used to generate an image of one or more objects.
A system and method for generating a digital avatar from a two-dimensional input image in accordance with machine learning models is provided. The machine learning models are generative adversarial networks trained to process a latent code into three-dimensional data and color data. A generative adversarial network (GAN) inversion optimization algorithm is run on the first machine learning model to map the input image to a latent code for the first machine learning model. The latent code is used to generate unstructured 3D data and color information. A GAN inversion optimization algorithm is then run on the second machine learning model to determine a latent code for the second machine learning model, based at least on the output of the first machine learning model. The latent code for the second machine learning model is then used to generate the data for the digital avatar.
A circuit includes a phase selector to generate an injection clock signal having an injection phase based on a phase of a digitally controlled oscillator (DCO) clock signal generated within a phase-locking feedback loop. An injection-locked oscillator (ILO), coupled to an output of the phase selector, generates an ILO clock signal that is convertible to provide a feedback clock signal of the circuit. Logic, coupled between an output of the ILO and the phase selector, causes the phase selector, at each predetermined number of cycles of the DCO clock signal, to output a phase shift in the injection clock signal that causes the ILO clock signal to comprise a rotated phase, relative to the injection phase, and that prevents a glitch in the injection clock signal.
H03K 5/00 - Manipulation of pulses not covered by one of the other main groups of this subclass
H03K 19/20 - Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits characterised by logic function, e.g. AND, OR, NOR, NOT circuits
41.
PARTITION-AWARE BROADCAST OPERATION FILTERING IN A MULTIPROCESSOR SYSTEM
Various embodiments include techniques for processing broadcast operations in a computing system. Typically, broadcast operations are transmitted to all portions of a particular subsystem, such as all cache slices in a cache memory. As a result, a process that issues a broadcast operation can interfere with one or more other processes that access portions of the subsystem not assigned to that process. To prevent such interference, logic in the computing system filters broadcast operations so as to transmit the broadcast operation to only the relevant portions assigned to the process that issued the broadcast operation. The logic tracks acknowledgments from the relevant portions and, when all pending acknowledgments have been received, the logic transmits a single acknowledgement to the process that issued the broadcast operation. The logic is dynamically configurable such that the logic can change the portions of the subsystem assigned to each process as needed.
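A software sketch of the filtering and ack-aggregation logic (the class, slice handlers, and synchronous ack tracking are illustrative stand-ins for the hardware):

```python
class BroadcastFilter:
    """Filter broadcasts so they reach only the cache slices assigned
    to the issuing process, then aggregate per-slice acknowledgments
    into a single ack returned to that process."""

    def __init__(self, slices):
        self.slices = slices             # slice id -> operation handler
        self.assignment = {}             # process -> set of slice ids

    def assign(self, process, slice_ids):
        # Dynamically (re)configure which slices a process owns.
        self.assignment[process] = set(slice_ids)

    def broadcast(self, process, op):
        targets = sorted(self.assignment[process])
        pending = set(targets)           # outstanding acknowledgments
        for sid in targets:              # filtered delivery only
            self.slices[sid](op)
            pending.discard(sid)         # per-slice ack received
        assert not pending               # all acks in: send one ack back
        return {"process": process, "acked_slices": targets}

seen = {sid: [] for sid in range(4)}
bf = BroadcastFilter({sid: seen[sid].append for sid in range(4)})
bf.assign("p0", [0, 1])                  # p0 owns slices 0 and 1 only
ack = bf.broadcast("p0", "invalidate")
```

Slices 2 and 3 never see the operation, so a process confined to its own partition cannot perturb the others.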
In various examples, determining emotional states for speech in conversational artificial intelligence (AI) and/or digital avatar systems and applications is described herein. Systems and methods are disclosed that use one or more machine learning models to determine one or more emotional states associated with speech, where the machine learning model(s) may be trained using various processes. For instance, in some examples, the machine learning model(s) may be trained during a first training process to determine probabilities for distributions of values, where the distributions model different emotional states. For example, a distribution may include a first value for angry, a second value for happy, a third value for sad, and/or so forth. Additionally, or alternatively, in some examples, the machine learning model(s) may be trained during a second training process to more precisely determine the actual emotional states (and/or the probabilities) based on training data representing human feedback.
G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks
G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
G10L 15/06 - Creation of reference templatesTraining of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for processing of video signals
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
43.
SUPPLEMENTING SENSOR DATA FOR PROCESSING USING AI SYSTEMS AND APPLICATIONS
In various examples, processing sensor data using rankings for AI systems and applications is described herein. Systems and methods are disclosed that determine rankings for sensor data, where the rankings may then be used to process the sensor data. For instance, a device that generates sensor data using a sensor may determine rankings for various portions of the sensor data, such as by analyzing the sensor data and/or related sensor data to detect configured events. For example, if the sensor data includes image data, then the device may determine a respective ranking for different groups of frames that are associated with different events. A system(s) may then use the rankings when processing the sensor data using one or more processing tasks. For example, the system(s) may determine which processing tasks to use for processing different portions of the sensor data based at least on the rankings.
Apparatuses, systems, and techniques to represent polygon data as pixels as part of a rasterization process. In at least one embodiment, a processor causes identification of pixels within a polygon based, at least in part, on one or more prefix sums of amounts of edges of pixels covered by that polygon.
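One way to picture the prefix-sum idea is a scanline even-odd fill: a pixel is inside the polygon when the running (prefix-sum) count of polygon edges crossed to its left is odd. This is a simplified software analogy, not the disclosed hardware method:

```python
def pixels_inside_row(crossings):
    """For one scanline, crossings[i] is the number of polygon edges crossed
    entering pixel cell i. The prefix sum of these counts gives, per pixel,
    how many edges lie to its left; odd parity marks the pixel as inside
    (even-odd rule)."""
    inside, total = [], 0
    for c in crossings:
        total += c  # prefix sum of edge-crossing counts
        inside.append(total % 2 == 1)
    return inside

# Edges crossed entering cells 2 and 5: pixels 2..4 are inside.
print(pixels_inside_row([0, 0, 1, 0, 0, 1, 0]))
# [False, False, True, True, True, False, False]
```

Computing the crossing counts per cell and then a single prefix sum is what makes the per-pixel inside test a constant-time parity check.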
One embodiment of a method for classifying data includes processing the data via a trained machine learning model that includes a plurality of layers, where each layer generates one or more corresponding features, generating a first distribution of features based on the one or more corresponding features generated by each layer included in the plurality of layers, and determining a first class for the data based on a comparison of the first distribution of features with one or more predefined distributions of features that are associated with one or more classes.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video featuresCoarse-fine approaches, e.g. multi-scale approachesImage or video pattern matchingProximity measures in feature spaces using context analysisSelection of dictionaries
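The distribution-comparison step of the classification method above can be sketched as follows. The descriptor (per-layer feature means) and the distance metric are illustrative choices; the abstract leaves the exact form of the distributions and the comparison open:

```python
import numpy as np

def classify_by_feature_distribution(layer_features, class_descriptors):
    """Summarize the features generated by each layer of a trained model
    into one distribution descriptor (here, the mean of each layer's
    features), then assign the class whose predefined descriptor is
    closest in Euclidean distance. A minimal sketch."""
    descriptor = np.array([f.mean() for f in layer_features])
    distances = {cls: np.linalg.norm(descriptor - d)
                 for cls, d in class_descriptors.items()}
    return min(distances, key=distances.get)

# Hypothetical per-layer features and predefined per-class descriptors.
layers = [np.array([0.9, 1.1]), np.array([2.0, 2.2])]
classes = {"cat": np.array([1.0, 2.1]), "dog": np.array([5.0, 6.0])}
print(classify_by_feature_distribution(layers, classes))  # cat
```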
46.
MACHINE-LEARNING-BASED ARCHITECTURE SEARCH METHOD FOR A NEURAL NETWORK
In at least one embodiment, differentiable neural architecture search and reinforcement learning are combined under one framework to discover network architectures with desired properties such as high accuracy, low latency, or both. In at least one embodiment, an objective function for search based on generalization error prevents the selection of architectures prone to overfitting.
G06N 3/04 - Architecture, e.g. interconnection topology
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G06F 7/57 - Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups or for performing logical operations
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Systems, devices, and methods for disaggregating networking components are provided. An example networking chassis includes a first disaggregated server device supported by the networking chassis that includes a first central processing unit (CPU) and a first graphics processing unit (GPU) coupled with the first CPU. The networking chassis further includes a first insertable switch module communicably coupled with the first disaggregated server device that includes first switching chipsets and a first fabric management controller coupled with the first switching chipsets. The first insertable switch module at least partially controls data transmission associated with the first disaggregated server device. The first GPU of the first disaggregated server device is isolated on the first disaggregated server device, supported on the first disaggregated server device in the absence of other GPUs, or is otherwise the only GPU on the first disaggregated server device so as to provide modularity in networking applications.
Visual language processors that include an image encoder configured to convert an image into a low-resolution feature map, a feature refinement network configured to upsample the low-resolution feature map into a high-resolution feature map, and a visual-language connector configured to map an image-level feature map and a region-level feature map both derived from the high-resolution feature map into an embedding space of a language encoder.
In various examples, a memory model may support multicasting where a single request for a memory access operation may be propagated to multiple physical addresses associated with multiple processing elements (e.g., corresponding to respective local memory). Thus, the request may cause data to be read from and/or written to memory for each of the processing elements. In some examples, a memory model exposes multicasting to processes. This may include providing for separate multicast and unicast instructions or shared instructions with one or more parameters (e.g., indicating a virtual address) being used to indicate multicasting or unicasting. Additionally or alternatively, whether a request(s) is processed using multicasting or unicasting may be opaque to a process and/or application or may otherwise be determined by the system. One or more constraints may be imposed on processing requests using multicasting to maintain a coherent memory interface.
G06F 12/1045 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
G06F 12/109 - Address translation for multiple virtual address spaces, e.g. segmentation
G06F 13/16 - Handling requests for interconnection or transfer for access to memory bus
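The multicast-versus-unicast distinction above can be modeled in a few lines of toy Python (real hardware resolves the physical addresses through the memory model's translation machinery; this only illustrates the fan-out semantics):

```python
def multicast_store(local_memories, physical_addresses, value):
    """A single store request propagated to a physical address in each
    participating processing element's local memory."""
    for mem, addr in zip(local_memories, physical_addresses):
        mem[addr] = value

def unicast_store(local_memories, element, address, value):
    """The unicast counterpart: one request, one processing element."""
    local_memories[element][address] = value

mems = [{}, {}, {}]                               # three PEs' local memories
multicast_store(mems, [0x10, 0x20, 0x30], 42)     # one request, three writes
unicast_store(mems, 1, 0x40, 7)                   # one request, one write
print(mems)  # [{16: 42}, {32: 42, 64: 7}, {48: 42}]
```

Whether a given request takes the multicast or unicast path may, per the abstract, be selected explicitly via instructions/parameters or decided by the system transparently to the process.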
A new level(s) of hierarchy—Cooperative Group Arrays (CGAs)—and an associated new hardware-based work distribution/execution model is described. A CGA is a grid of thread blocks (also referred to as cooperative thread arrays (CTAs)). CGAs provide co-scheduling, e.g., control over where CTAs are placed/executed in a processor (such as a GPU), relative to the memory required by an application and relative to each other. Hardware support for such CGAs guarantees concurrency and enables applications to see more data locality, reduced latency, and better synchronization between all the threads in tightly cooperating collections of CTAs programmably distributed across different (e.g., hierarchical) hardware domains or partitions.
The disclosure provides a cooling solution that evaluates the thermal environment of a computer component based on transient thermal responses of the computer component. The transient thermal responses are generated by measuring the temperature rise of the computer component over a designated amount of time for multiple “good” assemblies and multiple “bad” assemblies to determine a duration and allowable temperature rise needed to set a pass/fail criteria for different failure modes of cooling devices. A cooling device may not be operating as designed due to damage, needed maintenance, missing thermal interface material (TIM), improper installation, etc. From the transient thermal responses, a thermal problem, such as a malfunctioning fan, can be determined and a corrective action can be performed.
G01K 7/42 - Circuits effecting compensation of thermal inertiaCircuits for predicting the stationary value of a temperature
G01M 99/00 - Subject matter not provided for in other groups of this subclass
G05B 19/404 - Numerical control [NC], i.e. automatically operating machines, in particular machine tools, e.g. in a manufacturing environment, so as to execute positioning, movement or co-ordinated operations by means of programme data in numerical form characterised by control arrangements for compensation, e.g. for backlash, overshoot, tool offset, tool wear, temperature, machine construction errors, load, inertia
H05K 7/20 - Modifications to facilitate cooling, ventilating, or heating
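The pass/fail criterion derived from transient thermal responses can be illustrated with a toy check (the duration and allowable temperature rise below are illustrative placeholders, not values from the disclosure):

```python
def thermal_check(temps, duration_s, max_rise):
    """Pass/fail test on a transient thermal response: the temperature rise
    measured over the designated interval must stay within the allowable
    rise established from known-good assemblies. temps is a per-second
    temperature trace in degrees C."""
    rise = temps[duration_s] - temps[0]
    return "pass" if rise <= max_rise else "fail"

# A malfunctioning fan or missing TIM shows a steeper-than-allowed rise.
print(thermal_check([30, 34, 39, 45, 52], duration_s=4, max_rise=15))  # fail
print(thermal_check([30, 31, 33, 35, 38], duration_s=4, max_rise=15))  # pass
```

Different (duration, allowable-rise) pairs, calibrated per failure mode, would distinguish, e.g., a failed fan from a degraded thermal interface.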
52.
IMAGE STITCHING WITH COLOR HARMONIZATION FOR SURROUND VIEW SYSTEMS AND APPLICATIONS
In various examples, color statistic(s) from ground projections are used to harmonize color between reference and target frames representing an environment. The reference and target frames may be projected onto a representation of the ground (e.g., a ground plane) of the environment, an overlapping region between the projections may be identified, and the portion of each projection that lands in the overlapping region may be taken as a corresponding ground projection. Color statistics (e.g., mean, variance, standard deviation, kurtosis, skew, correlation(s) between color channels) may be computed from the ground projections (or a portion thereof, such as a majority cluster) and used to modify the colors of the target frame to have updated color statistics that match those from the ground projection of the reference frame, thereby harmonizing color across the reference and target frames.
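For the mean/standard-deviation case listed above, the per-channel transfer can be sketched as follows (a simplified sketch assuming float frames in [0, 1]; the abstract also allows higher-order statistics such as kurtosis, skew, and channel correlations):

```python
import numpy as np

def harmonize(target, ref_overlap, tgt_overlap):
    """Shift and scale each color channel of the target frame so that its
    statistics match those computed from the reference frame's ground
    projection within the overlapping region."""
    ref_mean = ref_overlap.mean(axis=(0, 1))
    ref_std = ref_overlap.std(axis=(0, 1))
    tgt_mean = tgt_overlap.mean(axis=(0, 1))
    tgt_std = tgt_overlap.std(axis=(0, 1))
    out = (target - tgt_mean) / np.maximum(tgt_std, 1e-6) * ref_std + ref_mean
    return np.clip(out, 0.0, 1.0)

# Toy overlap regions: the target camera renders the same ground darker.
ref = np.full((4, 4, 3), 0.6)
tgt = np.full((4, 4, 3), 0.2)
out = harmonize(tgt, ref, tgt)  # target colors pulled to reference stats
```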
In various examples, model-based processing to reduce reaction times for content streaming systems and applications is described herein. Systems and methods are disclosed that use one or more machine learning models to process image data representative of frames of an application, such as a gaming application, in order to generate updated image data representative of one or more updated frames that help reduce reaction times for users. For instance, the machine learning model(s) may update one or more visual characteristics associated with the frames, such as a contrast, a brightness, and/or a saturation associated with the frames. As described herein, the machine learning model(s) may be trained to update the frames in order to reduce the reaction times of users, such as by using one or more loss functions that measure loss in predicted reaction times and/or loss associated with visual characteristics of frames.
G06T 19/20 - Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
A63F 13/213 - Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
Apparatuses, systems, and techniques to generate a software program based on graph nodes that indicate hardware library functions to be performed. In at least one embodiment, a compiler generates a software program that performs hardware library functions based on a graph having graph nodes. In at least one embodiment, kernels are generated that perform hardware library functions indicated by graph nodes.
Embodiments of the present disclosure relate to a neural network architecture for generating high-quality, multi-modal trajectories as seed trajectories for optimization-based motion planners. The neural network architecture utilizes an observation encoder configured to encode environmental observations, and a noise prediction network configured to perform denoising based on the observations. The neural network architecture is configured to generate multiple seed trajectories in parallel by simultaneously running several instances of the noise prediction network. In contrast to conventional techniques, this architecture produces multiple high-quality seed trajectories at once, significantly enhancing the efficiency and speed of motion planning tasks.
One embodiment of a method for designing a system includes processing historical data associated with zero or more previous designs of the system using a trained machine learning model to predict a plurality of rewards for a plurality of designs of the system that are associated with different combinations of parameter values, and selecting, from the plurality of designs of the system, a first design of the system that is associated with a highest reward included in the plurality of rewards.
In various examples, systems and methods are described for performing 3D object detection based at least on motion cues. In some examples, systems can obtain data associated with a plurality of LiDAR scans. The systems can then determine a set of point trajectories for points that move from scan to scan over time using a message passing network (MPN) and identify points that are associated with given objects represented by the LiDAR scans. The points can then be annotated based at least on whether they are associated with static objects to train an object detector or similar models.
Systems, devices, and methods for disaggregating networking components are provided. An example datacenter rack includes a first networking chassis including a first disaggregated server device supported by the first networking chassis. The first disaggregated server device includes a first central processing unit (CPU) and a first graphics processing unit (GPU) coupled with the first CPU configured to perform computing operations associated with the first networking chassis. The example datacenter rack also includes a second networking chassis including a second disaggregated server device supported by the second networking chassis. The second disaggregated server device includes a second CPU and a second GPU coupled with the second CPU configured to perform computing operations associated with the second networking chassis.
Systems, devices, and methods for disaggregating networking components are provided. An example networking system includes a first network domain including a first networking chassis. The first networking chassis includes a first disaggregated server device supported by the first networking chassis including a first central processing unit (CPU); and a first graphics processing unit (GPU) coupled with the first CPU. The first CPU and the first GPU are configured to perform one or more computing operations associated with the first networking chassis. The networking system further includes a second network domain comprising a plurality of rack switches operably coupled with at least the first networking chassis.
Systems, devices, and methods for disaggregating networking components are provided. An example cable cartridge for network connections includes a housing defining a first portion configured to be coupled with at least a first disaggregated server device supported by a networking chassis. The first disaggregated server device includes a first central processing unit (CPU) and a first graphics processing unit (GPU) coupled with the first CPU. The first CPU and the first GPU are configured to perform one or more computing operations associated with the networking chassis. The housing defines a second portion configured to be coupled with at least a first insertable switch module supported by the networking chassis. The cable cartridge is configured to operably couple the first disaggregated server device and the first insertable switch module.
Described approaches provide for effectively and scalably using multiple GPUs to build and probe hash tables and materialize results of probes. Random memory accesses by the GPUs to build and/or probe a hash table may be distributed across GPUs and executed concurrently using global location identifiers. A global location identifier may be computed from data of an entry and identify a global location for an insertion and/or probe using the entry. The global location identifier may be used by a GPU to determine whether to perform an insertion or probe using an entry and/or where the insertion or probe is to be performed. To coordinate GPUs in materializing results of probing a hash table, a global offset to the global output buffer may be maintained in memory accessible to each of the GPUs, or the GPUs may compute global offsets using an exclusive sum of the local output buffer sizes.
G06F 16/28 - Databases characterised by their database models, e.g. relational or object models
H04L 9/06 - Arrangements for secret or secure communicationsNetwork security protocols the encryption apparatus using shift registers or memories for blockwise coding, e.g. D.E.S. systems
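The two coordination mechanisms above, deriving an owning GPU from a global location identifier and computing output offsets via an exclusive sum, can be sketched in Python (the partitioning scheme and helper names are hypothetical illustrations):

```python
def global_location(key, table_size, num_gpus):
    """Hash a key to a global slot in the hash table, then derive the owning
    GPU and the local slot on that GPU (assuming an even range partition of
    the table across GPUs)."""
    slot = hash(key) % table_size
    per_gpu = table_size // num_gpus
    return slot // per_gpu, slot % per_gpu  # (gpu id, local slot)

def global_offsets(local_sizes):
    """Exclusive prefix sum of per-GPU local output buffer sizes, giving
    each GPU's starting offset in the global output buffer."""
    offsets, total = [], 0
    for n in local_sizes:
        offsets.append(total)
        total += n
    return offsets

print(global_offsets([3, 0, 5, 2]))  # [0, 3, 3, 8]
```

With these offsets, each GPU can write its probe results into a disjoint region of the global output buffer without further synchronization.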
62.
DYNAMICALLY HARDENING COMMUNICATIONS HAVING INSECURE PROTOCOLS
In various examples, communications having insecure protocols are dynamically hardened. For example, communications that are formatted in an outdated or otherwise insecure version of a protocol (e.g., sent by a device aged out of a service window) may be isolated within a network, converted to an updated protocol format, or any combination thereof. These systems and methods may be implemented on a general purpose network device (e.g., a hub of a Local Area Network (LAN)).
In various examples, systems and methods described herein may cause a machine to establish a network connection with a network access device based at least on map data indicating to use the network access device for network connectivity at a location of the machine. In some examples, the map data may be generated based at least on one or more network performance scores associated with one or more network access devices disposed in the environment. In some instances, the network performance score(s) may be determined based at least on one or more wireless network signals transmitted by the network access device(s) and obtained using the machine and/or another machine. The network performance score(s) may, in some examples, be indicative of a signal strength associated with the network access device(s) at the location of the machine.
A device includes a printed circuit board (PCB) including a signal trace electrically coupled to a signal via. The device further includes a coaxial ground shield configured to reduce signal interference with respect to the signal via. The coaxial ground shield includes a ground via formed in the PCB substantially surrounding the signal via and substantially coaxial with the signal via. The coaxial ground shield further includes metal plating on a wall of the ground via. The metal plating is electrically coupled to a ground plane of the PCB. The coaxial ground shield further includes resin at least partially filling the ground via.
In various examples, using spatial relationships for animation retargeting in digital avatar systems and applications is described herein. Systems and methods are disclosed that determine constraints using first points (e.g., first vertices) associated with joints and/or a mesh of a source character and second points (e.g., second vertices) associated with joints and/or a mesh of a target character. As described herein, the constraints may include, but are not limited to, one or more of deformation constraints, interaction constraints, feet constraints, and angle constraints. Systems and methods are further disclosed that then use the constraints when performing optimization for animation retargeting of the target character. In some examples, the optimization is performed in vertex space, performed with respect to joints (e.g., rotations and/or transformations of the joints), and/or performed using one or more techniques, such as gradient descent.
In a GPU design, “launching a worker” is de-coupled from “assigning a work item” in a work distributor, and new handshake mechanisms between a worker and the work distributor are provided for work assignment, in order to provide persistent kernel functionality. In example embodiments, software specifies the work that has to be done, hardware selects a variable number of workers based on available resources, and a hardware scheduler handshaking with the executing workers assigns more work as previously assigned work is completed and/or more resources become available.
The present invention relates to the technical field of liquid cooling systems. Disclosed is a liquid-cooling connection device, comprising a carrier, a male connector, a female connector, a reset member and a locking member, wherein the carrier is of a hollow shell-shaped structure, and is sleeved outside the male connector, and an outer surface of the male connector is spaced apart from an inner wall of the carrier; two ends of the male connector both pass through the carrier, a limiting ring is fixedly connected to the outer surface of the male connector that is located in the carrier, a clearance groove is provided in the inner wall of the carrier, the groove bottom of the clearance groove is spaced apart from the limiting ring, and the length of the clearance groove is larger than the thickness of the limiting ring; and one end of the male connector is in threaded connection with the female connector by means of a threaded groove provided in the male connector, and a flow cavity is provided at the groove bottom of the threaded groove. By means of the connection between the male connector and the carrier, the liquid-cooling connection device has a wide application range and a low manufacturing cost; its structures are easy to disassemble and replace, production and maintenance costs are low, and the device is easy to promote and use.
An on-chip network (NoC) is a critical component of a GPU, CPU, network switch, or accelerator. NoC latency and energy is reduced by fabricating wires (conductive paths) on an integrated circuit die not only horizontally and vertically, but also diagonally between the network nodes. The diagonal wires may be fabricated on separate routing layers than the horizontal and vertical wires. When the network nodes are arranged in a two-dimensional array, the diagonal wires reduce the latency of an example packet transfer from a network node at position (0,0) to another network node at position (3,3) to three diagonal hops compared with three horizontal and three vertical hops without diagonal wires, reducing the number of router delays to four compared with seven. Overall, the latency and energy of an on-chip network may be reduced by about 40% for diagonal traffic and about 20% on average.
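The hop-count savings quoted above follow from replacing a Manhattan route with a Chebyshev route, since a diagonal wire advances both coordinates in one hop. A small sketch of that arithmetic:

```python
def hops(src, dst, diagonal=False):
    """Minimum hop count between two nodes of a 2D NoC mesh. With diagonal
    wires, one hop can advance both coordinates at once (Chebyshev
    distance); without them, horizontal and vertical hops add up
    (Manhattan distance)."""
    dx, dy = abs(dst[0] - src[0]), abs(dst[1] - src[1])
    return max(dx, dy) if diagonal else dx + dy

print(hops((0, 0), (3, 3)))        # 6: three horizontal + three vertical hops
print(hops((0, 0), (3, 3), True))  # 3: three diagonal hops
# Counting the source router, router delays drop from 7 to 4.
```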
An on-chip network (NoC) is a critical component of a GPU, CPU, network switch, or accelerator. The network nodes may be arranged in a two-dimensional array with each network node coupled to neighboring network nodes vertically and horizontally, with or without diagonal connections. Conventional routers within network nodes are synchronous, taking from 1-10 clock cycles to determine an output port, arbitrate between virtual and physical channels, and account for credits. In contrast, in an embodiment, transmission of a packet between network nodes often occurs in less than one clock cycle because the handshake protocol and the circuitry are not synchronized using a clock signal. When implemented using asynchronous logic, the routing delay and power are reduced. The channel latency is the minimum time needed to drive the physical traces. Such an asynchronous NoC may reduce latency by a factor of two or more compared with a synchronous NoC.
Various embodiments include techniques for performing memory operations in a computing system. A processing unit in the computing system performs memory operations by accessing memory using two concurrent memory address maps (AMAPs). The processing unit accesses memory via a first fine-grain distributed AMAP in order to access memory with high bandwidth without imposing strict ordering of memory operations. The processing unit accesses memory via a second, source-ordered non-distributed AMAP in order to access memory where memory synchronization latency is more important than memory bandwidth. By accessing memory via the two concurrent AMAPs, the processing unit can select between high-bandwidth memory access and source-ordered memory access interchangeably, depending on which is desired for each memory operation. Further, the fine-grain distributed AMAP and the source-ordered non-distributed AMAP can maintain synchronization and coherency between one another concurrently without software intervention, thereby alleviating the burden on the application programmer to manage the two AMAPs.
In various examples, a technique for performing a mathematical reasoning task includes inputting a first prompt that includes (i) a set of example mathematical problems, (ii) example masked solutions to the example mathematical problems, and (iii) a mathematical problem into a first machine learning model, wherein each masked solution includes a set of symbols as substitutes for a set of numbers in a ground-truth solution for a corresponding example mathematical problem. The technique also includes generating, via execution of the first machine learning model based on the first prompt, a set of candidate masked solutions to the mathematical problem. The technique further includes inputting a second prompt that includes (i) the mathematical problem and (ii) at least one masked solution into a second machine learning model and generating, via execution of the second machine learning model based on the second prompt, a solution to the mathematical problem.
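The masking step above, substituting symbols for the numbers in a ground-truth solution, can be sketched with a small helper (the symbol scheme `N1`, `N2`, … is an illustrative choice, not specified by the abstract):

```python
import re

def mask_numbers(solution):
    """Replace each distinct number in a ground-truth solution with a
    symbolic placeholder, producing a masked solution suitable for the
    first prompt. Returns the masked text and the number-to-symbol map."""
    mapping = {}

    def substitute(match):
        number = match.group(0)
        if number not in mapping:
            mapping[number] = f"N{len(mapping) + 1}"
        return mapping[number]

    return re.sub(r"\d+(?:\.\d+)?", substitute, solution), mapping

masked, symbols = mask_numbers("3 apples plus 4 apples equals 7 apples")
print(masked)  # N1 apples plus N2 apples equals N3 apples
```

Masking the numbers forces the first model to produce solution structure rather than memorized arithmetic; the second prompt then pairs that structure with the concrete problem.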
Self-supervised mechanisms to evaluate speech quality, and self-supervised speech enhancement, based on the quantization error of a vector-quantized variational autoencoder that utilize clean speech with domain knowledge of speech processing incorporated into the model design to improve correlation with real quality scores; and a self-distillation mechanism combined with adversarial training.
G10L 25/60 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocodersCoding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 19/038 - Vector quantisation, e.g. TwinVQ audio
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
Imitation learning, or artificial intelligence-based learning from demonstration, aims to acquire an agent policy by observing and mimicking the behavior demonstrated in expert demonstrations. Imitation learning can be used to generate reliable and robust learned policies in a variety of tasks involving sequential decision-making, such as autonomous driving and robotics tasks. However, current imitation learning solutions are limited in their ability to generalize to states or goals unseen in the expert's demonstrations. The present disclosure integrates a diffusion model into generative adversarial imitation learning, which, compared to prior solutions, can provide superior performance in generalizing to states or goals unseen in the expert's demonstrations, provide data efficiency for varying amounts of available expert data, and capture more robust and smoother rewards.
At least one of the various embodiments is directed towards a computer-implemented method for generating trained artificial neural networks. The method includes, for each model layer included in a trained model, training one or more student model layers to mimic the model layer, for a first target device included in a plurality of target devices, generating one or more candidate architectures based on a constrained optimization problem and the one or more trained student model layers, training the one or more candidate architectures on a set of calibration data, selecting a first candidate architecture included in the one or more candidate architectures that is associated with a least amount of error, and performing a plurality of fine-tuning training operations on the first candidate architecture to generate a first trained student model.
Apparatuses, systems, and techniques for issuing entitlement tokens to a root of trust allowing the root of trust to take certain action(s) with respect to a component that it secures, for example, to affect a change in the software and/or features available to the component it secures. In some embodiments, the entitlement token issuance process may involve receiving an attestation report corresponding to a root of trust of a computing system, wherein the attestation report is cryptographically signed using a private key unique to the root of trust, verifying the attestation report using a public key corresponding to the private key, and based at least upon successful verification of the attestation report, issuing an entitlement token for the root of trust allowing the root of trust to take one or more actions with respect to a system component secured by the root of trust.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
A method for countering digital impersonation and unauthorized digital or audio usage due to a rise of deep-fake technologies is described. The method employs biometric authentication to ensure user authenticity in real-time during avatar or voice recording. Unique watermarks are generated at predetermined intervals and embedded into audio and/or video outputs. Device integrity can be additionally verified through device attestation, verifying both hardware and firmware integrity of the visual and audio systems. The device attestation may generate a device authentication key which can be combined with a user authentication key based on the biometric authentication to generate a combined watermark. Accordingly, real-time verification tools allow for the authentication of the watermark, ensuring ongoing content authenticity.
An integrated circuit package includes two or more discrete semiconductor dies coupled to a package substrate. The dies include a primary die and at least one secondary die. The primary die includes an external data transfer interface, a direct memory access (“DMA”) controller, a primary cross-die bridge, and at least one primary die-to-die interface. At least one of the secondary dies includes a secondary test controller, a secondary cross-die bridge, and a secondary die-to-die interface. A secondary die is operable to send a test data read request to a host system and to receive a test data read response therefrom. The secondary die is further operable to cause a test to be performed on circuitry within the die responsive to contents in the test data read response.
Techniques for executing trained machine learning models comprise presenting input data as a first activation tensor to a first layer of a trained machine learning model, compressing the first activation tensor using a first projection matrix that corresponds to the first layer of the trained machine learning model to produce a compressed first activation tensor, generating a first output tensor by multiplying the compressed first activation tensor and a first pre-computed weight matrix that corresponds to the first layer of the trained machine learning model, presenting the first output tensor as a second activation tensor to a second layer of the trained machine learning model, compressing the second activation tensor using a second projection matrix to produce a compressed second activation tensor, generating a second output tensor by multiplying the compressed second activation tensor and a second pre-computed weight matrix, and generating output data based on the second output tensor.
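The two-layer data flow described above can be traced with a small NumPy sketch (the tensor shapes are arbitrary illustrations; the projection and pre-computed weight matrices would come from the trained model in practice):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 64))    # first activation tensor (batch, dim)
P1 = rng.standard_normal((64, 16))  # projection matrix for layer 1
W1 = rng.standard_normal((16, 32))  # pre-computed weight matrix for layer 1
P2 = rng.standard_normal((32, 8))   # projection matrix for layer 2
W2 = rng.standard_normal((8, 10))   # pre-computed weight matrix for layer 2

z1 = x @ P1    # compress the first activation tensor
y1 = z1 @ W1   # first output tensor, via the pre-computed weights
z2 = y1 @ P2   # present y1 as the second activation and compress it
y2 = z2 @ W2   # second output tensor -> output data
print(y2.shape)  # (8, 10)
```

Because the projection happens before the weight multiply, each layer's matrix product runs at the compressed width (16 and 8 here) rather than the full activation width.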
A first thread is executed in a first pipeline of a first core of an integrated circuit (IC). The first core includes a first set of hardware structures. In response to a command to operate the IC with multiple cores, the first pipeline is flushed. The first core is partitioned to obtain a second core and a third core. The first pipeline is partitioned to obtain a second pipeline and a third pipeline. The first set of hardware structures is partitioned to obtain a second set of hardware structures and a third set of hardware structures. The first thread is executed on the second pipeline of the second core of the IC, the second core including the second set of hardware structures. A second thread is executed on the third pipeline of the third core of the IC, the third core including the third set of hardware structures.
Approaches presented herein provide for semantic data matching, as may be useful for selecting data from a large unlabeled dataset to train a neural network. For an object detection use case, such a process can identify images within an unlabeled set even when an object of interest represents a relatively small portion of an image or there are many other objects in the image. A query image can be processed to extract image features or feature maps from only one or more regions of interest in that image, as may correspond to objects of interest. These features are compared with images in an unlabeled dataset, with similarity scores being calculated between the features of the region(s) of interest and individual images in the unlabeled set. One or more highest scored images can be selected as training images showing objects that are semantically similar to the object in the query image.
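The selection step above amounts to scoring each unlabeled image by its similarity to the region-of-interest features of the query. A minimal sketch follows, using cosine similarity over invented 4-dimensional feature vectors; the abstract does not commit to a particular similarity measure or feature extractor.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def rank_unlabeled(roi_features, unlabeled):
    # Score each unlabeled image by its best similarity to any region of
    # interest extracted from the query image, then rank highest first.
    scored = []
    for name, feat in unlabeled.items():
        score = max(cosine(roi, feat) for roi in roi_features)
        scored.append((score, name))
    return sorted(scored, reverse=True)

# Hypothetical feature vectors standing in for an embedding backbone's output.
roi_features = [[0.9, 0.1, 0.0, 0.1]]         # query region of interest
unlabeled = {
    "img_a": [0.8, 0.2, 0.1, 0.0],            # semantically close
    "img_b": [0.0, 0.1, 0.9, 0.4],            # unrelated
}
ranking = rank_unlabeled(roi_features, unlabeled)
best = ranking[0][1]                          # highest-scored training candidate
```

Because only ROI features of the query are compared, a small object of interest is not diluted by the rest of the query image, which is the failure mode the abstract targets.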
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
G06F 18/2113 - Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06F 18/22 - Matching criteria, e.g. proximity measures
G06V 30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
Disclosed are systems and techniques that may generate task-oriented assistant responses to natural language requests of a user. The techniques include receiving a natural language request of a user and generating a task-oriented assistant response based on the natural language request. Generating the task-oriented assistant response includes converting the natural language request into a first set of tokens using a first machine learning model, applying a large language model (LLM) to the first set of tokens to obtain dialogue state information, modifying the dialogue state information using system state information, and determining a natural language response using the LLM and the modified dialogue state information, where the natural language response is the task-oriented response.
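The "modifying the dialogue state information using system state information" step can be pictured as a simple overlay of system-known facts onto the slots the LLM extracted. The sketch below uses invented slot names and a dict representation; the actual dialogue-state format is not specified in the abstract.

```python
def reconcile_dialogue_state(dialogue_state: dict, system_state: dict) -> dict:
    # Overlay system-known facts (e.g. device readings) onto the slots the
    # LLM extracted, so the response generator sees a consistent state.
    merged = dict(dialogue_state)
    for slot, value in system_state.items():
        merged[slot] = value   # system state is authoritative for its slots
    return merged

# Hypothetical slots for a climate-control request.
llm_state = {"intent": "set_temperature", "target_temp": 21, "zone": "rear"}
system_state = {"current_temp": 18, "zone": "driver"}   # sensors override "rear"
state = reconcile_dialogue_state(llm_state, system_state)
```

The merged state is what would then be handed back to the LLM to determine the task-oriented natural language response.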
Approaches presented herein provide for the matching and alignment of features in different instances of sensor data captured for an environment. At least one embodiment provides for accurate identification of matching landmarks between two or more tracks obtained from sensor-equipped machines. Track information can be collected to identify a number of landmarks within a region, and edges can be determined between landmarks that are within a maximum or determined distance from one another, forming edges that extend from one landmark to other landmarks within that distance to create a landmark graph. Landmark graphs for multiple tracks may be compared to identify corresponding edges. A set of corresponding edges for individual landmarks can be selected and counted to determine whether the edges between the different tracks satisfy a correspondence criterion or exceed a correspondence threshold value, which is indicative of a matching landmark between the different tracks.
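The graph construction above can be sketched as follows: connect every pair of landmarks within a maximum distance, describe each edge by its (rounded) length, and count descriptor matches between two tracks' graphs. The rounded-length descriptor, the distance threshold, and the correspondence threshold are all invented simplifications of whatever criteria an actual embodiment would use.

```python
import math
from itertools import combinations

def build_landmark_graph(landmarks, max_dist):
    # Connect every pair of landmarks within max_dist; index edges by a
    # crude descriptor (length rounded to 0.1) so graphs from different
    # tracks can be compared descriptor-by-descriptor.
    edges = {}
    for (ia, a), (ib, b) in combinations(enumerate(landmarks), 2):
        d = math.dist(a, b)
        if d <= max_dist:
            edges.setdefault(round(d, 1), []).append((ia, ib))
    return edges

def corresponding_edge_count(graph_a, graph_b):
    # Count edges whose descriptors appear in both tracks' graphs.
    return sum(min(len(graph_a[k]), len(graph_b[k])) for k in graph_a if k in graph_b)

track_a = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]   # landmark positions, pass 1
track_b = [(1.0, 1.0), (4.0, 5.0), (40.0, 0.0)]   # same region, second pass

ga = build_landmark_graph(track_a, max_dist=8.0)
gb = build_landmark_graph(track_b, max_dist=8.0)
matches = corresponding_edge_count(ga, gb)
is_match = matches >= 1   # assumed correspondence threshold
```

Comparing edge descriptors rather than raw landmark positions is what makes the comparison tolerant to the global offset between the two tracks.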
Apparatuses, systems, and techniques to generate a software program based on graph nodes that indicate hardware library functions to be performed. In at least one embodiment, a compiler generates a software program that performs hardware library functions based on a graph having graph nodes. In at least one embodiment, kernels are generated that perform hardware library functions indicated by graph nodes.
Techniques for compressing a machine learning model include executing a first trained machine learning model on training data to identify one or more activation tensors associated with at least one layer of the trained machine learning model; for each pairing of a first activation tensor included in the one or more activation tensors and a different fidelity metric included in a plurality of fidelity metrics, generating a corresponding partially compressed machine learning model; identifying a first projection matrix corresponding to the first activation tensor based on the plurality of corresponding partially compressed machine learning models; generating a compressed machine learning model by at least multiplying the first projection matrix and a corresponding weight matrix; and generating a retrained compressed machine learning model by at least retraining the corresponding weight matrix while keeping the first projection matrix static.
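The offline "multiplying the first projection matrix and a corresponding weight matrix" step can be sketched as folding the projection into the weights once, so the deployed model only ever multiplies small matrices. The toy values below (and the use of an orthonormal projection, for which W' = Pᵀ W) are illustrative assumptions, not the claimed method.

```python
def matmul(a, b):
    # Plain-Python matrix multiply for small illustrative matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def compress_layer(projection, weight):
    # Fold the projection into the layer weight: W' = P^T @ W is the
    # smaller pre-computed weight matrix of the compressed model.
    return matmul(transpose(projection), weight)

# Rank-2 projection of a 3-dim activation space (toy, orthonormal columns).
P = [[1, 0], [0, 1], [0, 0]]
W = [[2.0, 0.0], [0.0, 2.0], [5.0, 5.0]]   # original layer weights (3 -> 2)
W_c = compress_layer(P, W)                  # compressed weights (2 -> 2)

x = [[1.0, 2.0, 0.0]]              # activation lying in the retained subspace
full = matmul(x, W)                # original layer output
approx = matmul(matmul(x, P), W_c) # compressed-model output
```

For activations within the retained subspace the two outputs agree exactly; the fidelity metrics mentioned in the abstract would govern how much error is tolerated for activations outside it, before the final retraining step recovers accuracy with the projection held static.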
In various examples, feature identification using language models for autonomous and semi-autonomous systems and applications is described herein. Systems and methods described herein may use a language model(s) to determine information associated with features, such as surface markings, within an environment. For example, sensor data may be used to generate one or more images or other sensor data representations corresponding to an environment. The image(s) may then be processed to generate input data (e.g., input tokens) that is applied to the language model(s). Based at least on processing the input data, the language model(s) may be trained to output data (e.g., output tokens) representing information associated with one or more features. Additionally, the output data may be used to determine the information associated with the feature(s) within the environment, where the information may then be used to update a map and/or navigate one or more machines within the environment.
A first thread is executed in a first pipeline of a first core of an integrated circuit (IC). The first core includes a first set of hardware structures. A second thread is executed in a second pipeline of a second core of the IC. The second core includes a second set of hardware structures. In response to a command to operate the IC with a unified core, the first core is combined with the second core to obtain the unified core. To unify the first core, the first pipeline is unified with the second pipeline to obtain a unified pipeline, and the first set of hardware structures is unified with the second set of hardware structures to obtain a unified set of hardware structures. A single thread is executed in the unified pipeline of the unified core using the unified set of hardware structures.
Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create a higher resolution video using upsampled frames from a lower resolution video.
G06F 7/57 - Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups or for performing logical operations
A63F 13/50 - Controlling the output signals based on the game progress
Embodiments of the present disclosure relate to a general image prior based on Langevin diffusion. An original representation (image, 3D model, audio) is optimized to resemble a data distribution learned by a trained diffusion model. The original representation may be incomplete and is completed by the optimization process. In an embodiment, the diffusion model may be trained to use an additional conditioning input, such as a text prompt. The diffusion model receives a noisy latent as input and generates a denoised output, such as an image. Generally, the diffusion model samples the learned data distribution to produce the output. The conditioning input provides an additional constraint that causes the output to be “nudged” towards the learned data distribution. An example synthesis problem is to generate a panorama image that is much larger than the images used to train the diffusion model.
In various examples, high-precision semantic image editing for machine learning systems and applications are described. For example, a generative adversarial network (GAN) may be used to jointly model images and their semantic segmentations based on a same underlying latent code. Image editing may be achieved by using segmentation mask modifications (e.g., provided by a user, or otherwise) to optimize the latent code to be consistent with the updated segmentation, thus effectively changing the original, e.g., RGB image. To improve efficiency of the system, and to not require optimizations for each edit on each image, editing vectors may be learned in latent space that realize the edits, and that can be directly applied on other images with or without additional optimizations. As a result, a GAN in combination with the optimization approaches described herein may simultaneously allow for high precision editing in real-time with straightforward compositionality of multiple edits.
In various examples, estimated field of view or gaze information of a user may be projected external to a vehicle and compared to vehicle perception information corresponding to an environment outside of the vehicle. As a result, interior monitoring of a driver or occupant of the vehicle may be used to determine whether the driver or occupant has processed or seen certain object types, environmental conditions, or other information exterior to the vehicle. For a more holistic understanding of the state of the user, attentiveness and/or cognitive load of the user may be monitored to determine whether one or more actions should be taken. As a result, notifications, AEB system activations, and/or other actions may be determined based on a more complete state of the user as determined based on cognitive load, attentiveness, and/or a comparison between external perception of the vehicle and estimated perception of the user.
G06V 20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
B60W 30/09 - Taking automatic action to avoid collision, e.g. braking and steering
B60W 30/095 - Predicting travel path or likelihood of collision
B60W 40/08 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to drivers or passengers
B60W 50/14 - Means for informing the driver, warning the driver or prompting a driver intervention
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
91.
USING IMPORTANCE RESAMPLING TO REDUCE THE MEMORY INCOHERENCE OF LIGHT SAMPLING
Devices, systems, and techniques to incorporate lighting effects into computer-generated graphics. In at least one embodiment, a virtual scene comprising a plurality of lights is rendered by randomly sampling a set of lights from among the plurality of lights prior to rendering a frame of graphics. A subset of the set of lights is selected and used to render pixels within one or more portions of the frame.
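The two-stage sampling above can be sketched directly: before the frame, a small candidate set is drawn at random from all scene lights (so every pixel reads the same few lights, improving memory coherence), and per pixel one light is resampled from that set by importance. The importance weight, light representation, and shading stand-in below are all invented for illustration.

```python
import random

def importance(light, pixel):
    # Assumed importance weight: light intensity falling off with distance.
    lx, intensity = light
    return intensity / (1.0 + (lx - pixel) ** 2)

def render_row(lights, pixels, candidate_count=4, seed=7):
    rng = random.Random(seed)
    # Frame setup: randomly sample a small candidate set from the full
    # plurality of lights; all pixels in the frame share this set.
    candidates = rng.sample(lights, candidate_count)
    out = []
    for px in pixels:
        # Per pixel, importance-resample one light from the candidate set.
        weights = [importance(l, px) for l in candidates]
        light = rng.choices(candidates, weights=weights, k=1)[0]
        out.append(importance(light, px))   # stand-in for shading with that light
    return out

# 64 scene lights at positions 0..63 with varying intensities (toy values).
lights = [(float(i), 1.0 + (i % 3)) for i in range(64)]
row = render_row(lights, pixels=[0.0, 10.0, 20.0, 30.0])
```

The memory-coherence benefit comes from the candidate set: a frame touches `candidate_count` lights instead of all 64, while the per-pixel resampling step keeps the selection importance-weighted.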
One or more validity checks that model one or more aspects of human physiology may be applied to frames of detected human features to detect and respond to the presence of faults. Example validity checks include human feature constraints derived from the kinematics of human motion, anatomical and spatial constraints, consistency across detection modalities, and/or others. The present techniques may be utilized to validate human features detected by various computer vision tasks, such as those involving pose estimation, facial detection, gesture recognition, and/or activity monitoring, to name a few examples. In an example embodiment involving the use of a DMS to control the activation, operation, and/or deactivation of autonomous driving, the validity checks may be performed on ASIL-rated hardware, enabling one or more components of the DMS pipeline to run on hardware that need not be ASIL-rated, obviating the need for at least some built-in hardware tests and/or continuous monitoring.
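A kinematic validity check of the kind described above can be sketched for a single quantity, head yaw: a detected value must stay within an anatomical range, and may not change faster than a kinematic limit between consecutive frames. The specific limits and the yaw-only formulation are assumptions for illustration only.

```python
# Assumed limits modeling simple aspects of human physiology.
MAX_YAW_RATE_DEG_PER_S = 600.0      # kinematic limit on head rotation speed
YAW_RANGE_DEG = (-90.0, 90.0)       # anatomical range of head yaw

def validate_yaw_track(yaw_deg_frames, frame_dt):
    # Flag frames that violate either the anatomical range or the
    # frame-to-frame kinematic limit; a fault indicates a likely
    # detection error rather than real motion.
    faults = []
    for i, yaw in enumerate(yaw_deg_frames):
        if not YAW_RANGE_DEG[0] <= yaw <= YAW_RANGE_DEG[1]:
            faults.append((i, "anatomical_range"))
        if i > 0:
            rate = abs(yaw - yaw_deg_frames[i - 1]) / frame_dt
            if rate > MAX_YAW_RATE_DEG_PER_S:
                faults.append((i, "kinematic_limit"))
    return faults

# 30 fps track with one physically implausible jump injected at frame 3.
track = [0.0, 5.0, 8.0, 80.0, 12.0]
faults = validate_yaw_track(track, frame_dt=1 / 30)
```

Because such checks are cheap and deterministic, they are the kind of component that could plausibly run on ASIL-rated hardware while the upstream detector does not, which is the architectural point of the abstract.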
B60W 40/08 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to drivers or passengers
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
93.
MULTI-DIMENSIONAL BINNING FOR HIERARCHICAL PARTITIONING
In various examples, spatial elements of a scene may be projected into multi-dimensional bins that correspond to a spatial partitioning of the scene to determine assignments between the spatial elements and the multi-dimensional bins. A partition of the spatial elements may be determined using the assignments and a spatial element may be assigned to a node corresponding to a hierarchical partitioning of the spatial elements based on the partition. To determine the partition, candidate split planes may be determined with respect to the multi-dimensional bins, and a split plane that defines the partition may be selected from the candidate split planes. The assignments and the multi-dimensional bins may also be used to determine subpartitions of the partition. For example, the assignments may be used to determine the subpartitions with respect to a subset of the multi-dimensional bins that corresponds to the partition.
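The binning-and-split flow above can be sketched in one dimension: element centroids are projected into equal-width bins, each bin boundary is evaluated as a candidate split plane, and the best plane defines the partition. The balance cost used below is a stand-in for a real metric such as surface-area-heuristic cost, and the 1-D projection is a simplification of the multi-dimensional bins the abstract describes.

```python
def bin_and_split(centroids, num_bins=4):
    # Project element centroids (1-D here for clarity) into equal-width bins
    # to determine element-to-bin assignments.
    lo, hi = min(centroids), max(centroids)
    width = (hi - lo) / num_bins or 1.0
    bins = [0] * num_bins
    assignment = []
    for c in centroids:
        b = min(int((c - lo) / width), num_bins - 1)
        bins[b] += 1
        assignment.append(b)
    # Evaluate each bin boundary as a candidate split plane; pick the plane
    # minimizing an assumed cost (here: imbalance between the two sides).
    best_plane, best_cost = None, float("inf")
    for plane in range(1, num_bins):
        left_count = sum(bins[:plane])
        right_count = sum(bins[plane:])
        cost = abs(left_count - right_count)
        if cost < best_cost:
            best_plane, best_cost = plane, cost
    left_ids = [i for i, b in enumerate(assignment) if b < best_plane]
    right_ids = [i for i, b in enumerate(assignment) if b >= best_plane]
    return best_plane, left_ids, right_ids

centroids = [0.1, 0.2, 0.9, 1.4, 2.6, 3.1, 3.8, 3.9]
plane, left, right = bin_and_split(centroids)
```

Note that the bin assignments computed for the split are reused to form the two subpartitions, mirroring the abstract's reuse of assignments for subpartitioning.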
Various embodiments include techniques for performing memory store operations in a computing system. A memory management unit (MMU) receives various types of store operations from a processor, including unordered store operations, weak ordered store operations, and strong ordered store operations. The MMU performs virtual address to physical address translations for the store operations and forwards the translated store operations for execution. The MMU can perform translations for and forward unordered store operations and weak ordered store operations at any time. By contrast, the MMU can delay translations for and forwarding of strong ordered store operations while any prior ordered store operations are pending. In this manner, the MMU can enforce ordering of weak ordered store operations to be translated and executed prior to a subsequent strong ordered store operation. These techniques allow the processor to continue to perform other operations while ordered store operations are pending in the MMU.
An optical communication system includes at least one wave division multiplex (WDM) transmitter configured to generate WDM signals in multiple channels on a light guide, and a Code Division Multiple Access (CDMA) symbol generator coupled to modulate output of the WDM transmitter at a frequency below a noise floor of the WDM signals.
Apparatuses, systems, and techniques to use one or more neural networks to perform one or more tasks. In at least one embodiment, said one or more neural networks use one or more textual descriptions to generate one or more three-dimensional (3D) models of one or more first objects based, at least in part, on two or more images of one or more second objects from two or more viewpoints.
The technical solutions disclosed are directed to a multi-state reconciliation finite state automata operator framework. The systems and methods can identify one or more states between an initial state and a final state of an application executed by a service and one or more parameters corresponding to timing of implementation of the one or more states. The systems and methods can provide a model configured to manage progress corresponding to the one or more states of the application to determine, using a first matrix, a current state of the one or more states of the application and determine, using a second matrix, a parameter of the one or more parameters corresponding to a timing of implementation of the current state. The systems and methods can provide to the service an indication of the progress of the application.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
computer hardware for use in enhancing the performance of data centers supporting software applications for scaling and development of artificial intelligence models and architectures; computer hardware for use in enhancing the performance of data centers supporting software applications using artificial intelligence for machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics, business intelligence and computer vision; integrated circuits, semiconductors and computer chipsets for use in enhancing the performance of data centers supporting software applications for scaling and development of artificial intelligence models and architecture; integrated circuits, semiconductors and computer chipsets for use in enhancing the performance of data centers supporting software applications using artificial intelligence for machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics, business intelligence and computer vision; computer hardware; integrated circuits, semiconductors and computer chipsets; graphics processing units (GPUs); embedded processors for computers; computer networking hardware; computer networking switches; computer hardware for communication among central processing units (CPUs); data processing units (DPUs); computer hardware for enabling connections among central processing units (CPUs), servers and data storage devices; design and development of computer hardware for use in enhancing the performance of data centers supporting software applications for scaling and development of artificial intelligence models and architecture; design and development of computer hardware for use in enhancing the performance of data centers supporting software applications using artificial intelligence for machine learning, deep learning, natural language 
generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics, business intelligence and computer vision; design and development of integrated circuits, semiconductors and computer chipsets for use in enhancing the performance of data centers supporting software applications for scaling and development of artificial intelligence models and architecture; design and development of integrated circuits, semiconductors and computer chipsets for use in enhancing the performance of data centers supporting software applications using artificial intelligence for machine learning, deep learning, natural language generation, statistical learning, supervised learning, un-supervised learning, data mining, predictive analytics, business intelligence and computer vision; design and development of computer hardware; design and development in the field of computer networking hardware; design and development in the field of computer datacenter architecture; design and development of computer hardware, namely, integrated circuits, semiconductors, computer chipsets, graphics processing units (GPUs), embedded processors for computers, computer networking hardware, computer networking switches, computer hardware for communication among central processing units (CPUs), data processing units (DPUs), and data storage systems, computer hardware for enabling connections among central processing units (CPUs), servers and data storage devices; computer technical support services, namely, management and optimization of software and hardware for data centers
99.
NAVIGATION ROAD-GRAPH AND PERCEPTION LANE-GRAPH MATCHING
In various examples, embodiments are directed to navigation-road and perception-lane matching for autonomous and semi-autonomous systems and applications. In this regard, lane-road matching is performed using geometric similarity to generate effective lane-road mappings for use in lane planning and decision making, among other things. In some embodiments, road data representing at least one road section and lane data representing a lane associated with a location of an ego-machine are received. Thereafter, a determination is made that one or more consecutive road sections match a lane based at least on a geometric similarity between the lane and the one or more consecutive road sections. A representation of the lane being mapped to the one or more consecutive road sections is generated based at least on the determining that the one or more consecutive road sections matched the lane.
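The geometric-similarity matching above can be sketched as a polyline comparison: sample points along the lane, measure their distance to the nearest points on a candidate road geometry, and accept the best candidate under a threshold. The mean-distance similarity, the sampling scheme, and the threshold value are all invented stand-ins for whatever metric an actual embodiment would use.

```python
import math

def polyline_similarity(lane, road, samples=5):
    # Assumed similarity: mean distance between points sampled along the
    # lane and their nearest sampled points on the road geometry.
    def sample(poly, t):
        # Linear interpolation at parameter t in [0, 1] along the polyline.
        seg = t * (len(poly) - 1)
        i = min(int(seg), len(poly) - 2)
        f = seg - i
        (x0, y0), (x1, y1) = poly[i], poly[i + 1]
        return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
    total = 0.0
    for k in range(samples):
        p = sample(lane, k / (samples - 1))
        total += min(math.dist(p, sample(road, t / 100)) for t in range(101))
    return total / samples

def match_lane_to_roads(lane, road_sections, max_mean_dist=2.0):
    # Map the lane to the candidate (possibly concatenated consecutive)
    # road geometry with the best similarity under an assumed threshold.
    best = None
    for name, road in road_sections.items():
        d = polyline_similarity(lane, road)
        if d <= max_mean_dist and (best is None or d < best[1]):
            best = (name, d)
    return best

lane = [(0.0, 1.0), (10.0, 1.2), (20.0, 1.0)]
roads = {
    "r1+r2": [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)],   # parallel, about 1 m away
    "r3":    [(0.0, 30.0), (20.0, 30.0)],              # distant road
}
match = match_lane_to_roads(lane, roads)
```

Here the lane maps to the concatenated "r1+r2" geometry, which corresponds to the abstract's mapping of one lane to one or more consecutive road sections.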
Apparatuses, systems, and techniques to generate text from an audio signal. In at least one embodiment, one or more neural networks are used to generate text from an audio signal, wherein the one or more neural networks comprise one or more portions to each identify one or more features of a corresponding time period of the audio signal to be used to generate text corresponding to one or more other time periods of the audio signal.