41 - Education, entertainment, sporting and cultural services
Goods & Services
Conducting of commercial events in the nature of trade
shows; arranging of commercial events in the nature of trade
shows; arranging and conducting trade shows in the fields of
technology, business, software development, hardware
development, artificial intelligence, machine learning, deep
learning, large language models (LLMs), natural language
generation, statistical learning, supervised learning,
unsupervised learning, predictive analytics, business
intelligence, accelerated computing, edge computing, high
performance computing, computer graphics hardware, graphics
processing units (GPUs), electronics, data science,
autonomous machines, robotics, virtual reality, augmented
reality, cybersecurity, data storage, cloud computing.
Arranging and conducting educational conferences, seminars,
classes, workshops, courses and exhibitions and providing
non-downloadable webinars, all in the fields of technology,
business, software development, hardware development,
artificial intelligence, machine learning, deep learning,
large language models (LLMs), natural language generation,
statistical learning, supervised learning, unsupervised
learning, predictive analytics, business intelligence,
accelerated computing, edge computing, high performance
computing, computer graphics hardware, graphics processing
units (GPUs), electronics, data science, autonomous
machines, robotics, virtual reality, augmented reality,
cybersecurity, data storage, cloud computing.
2.
MASK GENERATION FOR FEATURE DETECTION IN AUTONOMOUS AND SEMI-AUTONOMOUS SYSTEMS AND APPLICATIONS
In various examples, systems and methods are described that may be used to generate a mask of a geographic area and corresponding vector representations of one or more environmental features included in the area. In some embodiments, the method and system may generate or obtain a first tile image representing a portion of a path surface corresponding to a geographic area. One or more features may be extracted from the data where the extracted features may indicate environmental characteristics associated with the path surface—e.g., lane boundaries, medians, traffic signs, signals, etc. Additionally, the method or system may generate a mask corresponding to the tile image and vector representations of the portion of the geographic area. In some embodiments, the locations of the environmental features associated with the mask may be reconciled with one or more previously generated masks that include some of the same environmental features.
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
3.
SYNCHRONIZING MEMORY MANAGEMENT UNITS IN MULTI-DIELET PROCESSOR ARCHITECTURES
This disclosure describes supporting distributed graphics and compute engines in multi-dielet parallel processing architectures, such as a multi-dielet graphics processing unit (GPU), and synchronizing memory management in such architectures. Respective dielets each have a memory management unit (MMU). The processing of at least one memory-related message type is serialized by a designated MMU for messages originating at any dielet, and the processing of at least some memory-related message types is performed locally on the originating dielets.
In various examples, optical flow-based algorithms may be used to detect objects in an environment by computing displacement fields for images captured using asynchronous cameras. As an example, an asynchronous set of cameras (e.g., two or more cameras) may capture a series of asynchronous images of an environment. Additionally, in some examples, the cameras may be positioned at different locations and capture different fields of view of the environment. Based at least on the differing image capture times and/or the differing fields of view, image pixels corresponding to the same physical locations in the environment may move between images of the series. The disclosed systems and methods may use optical flow algorithms to compute scores associated with the displacement/movement of the pixels throughout the series of images, as well as use these scores to detect objects in the environment.
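As a rough illustration of the displacement idea only (not the disclosed multi-camera method), the following Python sketch recovers the shift of an image patch between two frames by exhaustive block matching, using the matching error as a displacement score; all names and parameters are illustrative assumptions:

```python
import numpy as np

def block_displacement(img_a, img_b, top, left, size=8, radius=4):
    """Find the integer (dy, dx) that best aligns a patch of img_a with img_b.

    Exhaustive block matching over a small search window; the squared
    difference serves as the displacement "score" for the patch.
    """
    patch = img_a[top:top + size, left:left + size]
    best, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + size > img_b.shape[0] or c + size > img_b.shape[1]:
                continue  # candidate window falls outside the second image
            cand = img_b[r:r + size, c:c + size]
            score = float(np.sum((patch - cand) ** 2))
            if score < best_score:
                best, best_score = (dy, dx), score
    return best, best_score

# Synthetic check: shift an image by (2, 3) pixels and recover the displacement.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
shifted = np.roll(np.roll(img, 2, axis=0), 3, axis=1)
disp, score = block_displacement(img, shifted, top=10, left=10)
```

A dense displacement field would repeat this (or a proper optical flow solver) per pixel or per block; the per-patch score is what an object detector could then threshold.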
Large language models (LLMs) learn via machine learning to understand and generate human-like text, and are thus powerful tools for various language-based tasks, such as text summarization, translation, and content generation. However, to provide superior performance, an LLM often has a considerable model size and incurs high inference costs. To mitigate the size and execution costs of LLMs, methods have been developed to specifically compress LLMs. However, most existing methods either incur significant accuracy degradation compared to uncompressed models or require long training times, while their adaptability is often constrained by a limited range of hardware-supported compression formats. The present disclosure provides error compensation for a compressed LLM in a training-free manner that provides flexibility for diverse performance needs.
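One common training-free error-compensation strategy, used here purely as an illustrative stand-in for the disclosed approach, adds a low-rank approximation of the quantization residual back onto the compressed weights; the quantizer, rank, and matrix sizes below are made up:

```python
import numpy as np

def quantize(w, step=0.25):
    # Crude uniform quantizer standing in for any hardware-supported format.
    return np.round(w / step) * step

def compensate(w, w_q, rank=4):
    # Training-free compensation: approximate the quantization residual
    # with a truncated SVD and add the low-rank correction back.
    u, s, vt = np.linalg.svd(w - w_q, full_matrices=False)
    return w_q + (u[:, :rank] * s[:rank]) @ vt[:rank]

rng = np.random.default_rng(1)
w = rng.standard_normal((64, 64))   # one weight matrix of a toy model
w_q = quantize(w)
w_c = compensate(w, w_q, rank=8)
err_q = np.linalg.norm(w - w_q)     # error of plain quantization
err_c = np.linalg.norm(w - w_c)     # error after compensation
```

Because the truncated SVD is the best low-rank fit to the residual, the compensated error is strictly smaller than the plain quantization error whenever the residual has any energy in its leading singular directions.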
A method for forming a printed circuit board includes: forming on a substrate a first conductive layer for a first edge connector pin and a first conductive layer for a second edge connector pin, wherein the first conductive layer for the first edge connector pin and the first conductive layer for the second edge connector pin are electrically coupled to one another via a first conductive layer for an electrical bridging element; electroplating a second conductive layer onto both the first conductive layer for the first edge connector pin and the first conductive layer for the second edge connector pin via a plating current conductor; and removing at least a portion of the electrical bridging element to electrically separate the first edge connector pin from the second edge connector pin.
H05K 1/11 - Printed elements for providing electric connections to or between printed circuits
H05K 3/00 - Apparatus or processes for manufacturing printed circuits
H05K 3/04 - Apparatus or processes for manufacturing printed circuits in which the conductive material is applied to the surface of the insulating support and is thereafter removed from such areas of the surface which are not intended for current conducting or shielding the conductive material being removed mechanically, e.g. by punching
7.
TECHNIQUES FOR MODIFYING AN EXECUTABLE GRAPH TO PERFORM A WORKLOAD ASSOCIATED WITH A NEW TASK GRAPH
Techniques to modify executable graphs to perform different workloads. In at least one embodiment, an executable version of a first task graph is modified by applying a non-executable version of a second task graph to the executable version of the first task graph, so that the executable version of the first task graph can perform a second workload of the non-executable version of the second task graph.
Ray-tracing and path-tracing systems use vertex compression, including per-component shifts, unsigned arithmetic deltas, and base-vertex shortening. Topology encodings include enhanced explicit vertex indexing and implicit triangle-strip-based indexing with left, right, start, turn-back, and degenerate tokens, including rotation. Substantial lossless compression-ratio increases are realized, especially for highly quantized vertex data.
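A minimal sketch of the vertex-side idea, per-component shifts to a base vertex plus unsigned arithmetic deltas, fits in a few lines; this assumes already-quantized integer vertices and omits the topology encodings entirely:

```python
import numpy as np

def compress_vertices(quantized):
    # Per-component shift: store the component-wise minimum as the base
    # vertex, then unsigned arithmetic deltas relative to it. Deltas need
    # far fewer bits than the raw coordinates, and the scheme is lossless.
    base = quantized.min(axis=0)
    deltas = (quantized - base).astype(np.uint32)
    return base, deltas

def decompress_vertices(base, deltas):
    # Exact inverse: add the base vertex back to every delta.
    return base + deltas.astype(np.int64)

rng = np.random.default_rng(2)
verts = rng.integers(-1000, 1000, size=(16, 3))  # toy quantized mesh vertices
base, deltas = compress_vertices(verts)
restored = decompress_vertices(base, deltas)
```

The round trip is bit-exact, which is what makes the scheme usable for lossless ray-tracing acceleration-structure storage.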
A generative framework enables transformation of a conventional Gaussian diffusion model for modeling heavy-tailed distributions, such as the data distributions typical of scientific applications. In an embodiment, the denoising model predicts short-term or long-term events based on input data (e.g., certain weather or financial variables). In an embodiment, the denoising model generates high resolution data, such as generating local weather forecasts or conditions from certain weather variables for a larger region.
Embodiments of the present disclosure relate to multi-view LIDAR perception with motion cues for autonomous and semi-autonomous machines and applications. A DNN may be used to detect objects, a navigable space, weather or surface conditions, artifacts, and/or other parts or features of an environment based on multiple views of LIDAR data from multiple time slices. The DNN may include multiple input channels for processing multiple views of sensor data from multiple time slices to provide motion cues, and the extracted features from the different time slices may be geometrically projected from a first 2D view to a second 2D view, combined with features that were extracted from the second 2D view, and applied to a subsequent stage of the DNN. The data generated by the DNN may be provided to the drive stack of an autonomous vehicle or other ego-machine to enable safe planning and control of the vehicle.
In various examples, self-supervised learning may be used to pre-train an encoder network of a masked prediction model to reconstruct masked regions of an input representation of 3D detections such as LiDAR point cloud(s). Spatial and/or temporal masking may be applied to a projected representation of 3D detections (e.g., a two-dimensional (2D) projection image), and the masked prediction model (e.g., a masked auto-encoder or joint-embedding predictive architecture) may be used to reconstruct a representation of the masked regions (e.g., reflection characteristic(s) stored in corresponding pixels or cells of the projected representation, a latent representation of the reflection characteristic(s)) during iterations of self-supervised learning. As such, the pre-trained encoder network of the masked prediction model may be used as a foundation model and fine-tuned with a task-specific output head or its pre-trained weights may be used to initialize a task-specific model.
Embodiments of the present disclosure relate to a machine performing one or more planning, navigation, or control operations based at least on one or more outputs of one or more neural networks in which the one or more outputs are computed based at least on the one or more neural networks processing sensor data generated using a plurality of perception sensors of the machine.
In various examples, virtual participant-based microservices for video conferencing applications and systems are provided. A virtual participant service provides a subject matter expert to participants of a conference session. A virtual participant may be presented within a video conferencing environment as a simulated meeting participant that other meeting participants may interact with using natural conversational language. The virtual participant service may include a virtual participant controller frontend service that interfaces with the video conferencing platform, an avatar manager to generate an avatar representing the virtual participant, and an LLM services gateway that functions as a microservices server for one or more LLM-based services that may be accessed through the virtual participant. The virtual participant service may use natural language processing to evaluate spoken requests for information and provide a response back to the human user participants of the conference channel using an animated avatar.
A parallel processing unit comprises a plurality of processors, each coupled to memory access hardware circuitry. Each memory access hardware circuit is configured to receive, from the coupled processor, a memory access request specifying a coordinate of a multidimensional data structure, wherein the memory access hardware circuit is one of a plurality of such circuits, each coupled to a respective one of the processors; and, in response to the memory access request, translate the coordinate of the multidimensional data structure into plural memory addresses for the multidimensional data structure and, using the plural memory addresses, asynchronously transfer at least a portion of the multidimensional data structure for processing by at least the coupled processor. The memory locations may be in the shared memory of the coupled processor and/or an external memory.
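The coordinate-to-addresses translation can be illustrated with a toy row-major example; the function name, shapes, and element size are assumptions, and real hardware performs this in circuitry rather than software:

```python
def tile_addresses(coord, tile_shape, array_shape, elem_size=4, base=0):
    """Translate a (row, col) tile coordinate into the linear byte addresses
    covering that tile of a row-major 2D array.

    One coordinate fans out into one starting address per tile row,
    mirroring how a single request can drive a multi-address transfer.
    """
    tr, tc = tile_shape
    rows, cols = array_shape
    r0, c0 = coord[0] * tr, coord[1] * tc
    addrs = []
    for r in range(r0, min(r0 + tr, rows)):   # clamp at the array boundary
        addrs.append(base + (r * cols + c0) * elem_size)
    return addrs

# Tile (1, 2) of a 16x16 float32 array, with 4x4 tiles:
addrs = tile_addresses((1, 2), (4, 4), (16, 16))
```

Each returned address is the start of one contiguous tile row; an asynchronous copy engine would then move `tc * elem_size` bytes from each of them.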
G06F 12/0875 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches; with dedicated cache, e.g. instruction or stack
Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing generative text-to-speech models. The techniques include identifying a mapping of speech characteristics (SC) on a target distribution of a latent variable using a non-linear transformation for at least a subset of the SC. Parameters of the non-linear transformation are determined using a neural network that approximates statistics of the SC with statistics predicted for the SC based on the identified mapping and the target distribution of the latent variable.
G10L 13/027 - Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
16.
EFFICIENT DENOISING FOR RAY-TRACING SYSTEMS AND APPLICATIONS
In examples, a filter used to denoise shadows for a pixel(s) may be adapted based at least on variance in temporally accumulated ray-traced samples. A range of filter values for a spatiotemporal filter may be defined based on the variance and used to exclude temporal ray-traced samples that are outside of the range. Data used to compute a first moment of a distribution used to compute variance may be used to compute a second moment of the distribution. For binary signals, such as visibility, the first moment (e.g., accumulated mean) may be equivalent to a second moment (e.g., the mean squared). In further respects, spatial filtering of a pixel(s) may be skipped based on comparing the mean of variance of the pixel(s) to one or more thresholds and based on the accumulated number of values for the pixel.
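The moment identity for binary signals is easy to verify: for 0/1 samples such as visibility, x² = x, so the accumulated mean serves as both the first and the second moment, and the variance follows directly. A minimal sketch (sample values are made up):

```python
def accumulated_variance(samples):
    # For a binary signal such as visibility, each sample squared equals
    # itself, so the accumulated first moment doubles as the second moment.
    n = len(samples)
    m1 = sum(samples) / n          # accumulated mean (first moment)
    m2 = m1                        # second moment == first moment for 0/1 data
    return m2 - m1 * m1            # variance = E[x^2] - (E[x])^2

# Eight visibility samples: six lit, two shadowed.
v = accumulated_variance([1, 0, 1, 1, 0, 1, 1, 1])
```

This is why a denoiser can derive the variance that drives its filter-value range without storing a separate squared-sample accumulator for binary signals.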
Image inpainting aims to restore damaged regions of a target image. Because any plausible outcome could be considered valid for this task, reference-based image inpainting has been used, in which a reference image (e.g., one capturing substantially the same scene as the target image) guides the inpainting process, thereby increasing the probability that the target image is restored to its original state. However, current diffusion models used for image inpainting, even though conditioned on reference images, lack direct awareness of the relationships between the target and reference, which results in a loss of faithfulness in the inpainted result. The present disclosure guides the inpainting process of a diffusion model with reference-target image correspondences as constraints, which can preserve the reference-target geometric relationships and thus enhance the faithfulness of the inpainted target image to the reference image.
41 - Education, entertainment, sporting and cultural services
Goods & Services
Conducting of commercial events in the nature of trade
shows; arranging of commercial events in the nature of trade
shows; arranging and conducting trade shows in the fields of
technology, business, software development, hardware
development, artificial intelligence, machine learning, deep
learning, large language models (LLMs), natural language
generation, statistical learning, supervised learning,
unsupervised learning, predictive analytics, business
intelligence, accelerated computing, edge computing, high
performance computing, computer graphics hardware, graphics
processing units (GPUs), electronics, data science,
autonomous machines, robotics, virtual reality, augmented
reality, cybersecurity, data storage, cloud computing.
Arranging and conducting educational conferences, seminars,
classes, workshops, courses and exhibitions and providing
non-downloadable webinars, all in the fields of technology,
business, software development, hardware development,
artificial intelligence, machine learning, deep learning,
large language models (LLMs), natural language generation,
statistical learning, supervised learning, unsupervised
learning, predictive analytics, business intelligence,
accelerated computing, edge computing, high performance
computing, computer graphics hardware, graphics processing
units (GPUs), electronics, data science, autonomous
machines, robotics, virtual reality, augmented reality,
cybersecurity, data storage, cloud computing.
19.
AUTO-REGRESSIVE AUTO-ENCODER FOR ARTISTIC MESH GENERATION
Automatic 3D content generation, particularly the generation of polygonal meshes, is useful for development of digital gaming, virtual reality, and filmmaking. Generative models in particular make 3D asset creation more accessible to non-experts. Some existing approaches rely on continuous 3D representations which lose the discrete face indices in triangular meshes during conversion and consequently require post-processing to extract triangular meshes which will then differ significantly from artist-created ones. More recently, attempts have been made to tokenize meshes into 1D sequences and leverage auto-regressive models for direct mesh generation, which can preserve the topology information and generate artistic meshes, but these methods are inefficient, result in accuracy loss, and cannot generalize beyond the training domain. The present disclosure provides an auto-regressive auto-encoder configured for artistic mesh generation, which can compress variable-length triangular meshes into fixed-length latent codes to enable training latent diffusion models conditioned on different modalities for improved generalization.
A buffer management system receives packet data associated with a first packet characteristic of a plurality of packet characteristics of a plurality of packets transmitted via a digital interface. The packet data is stored in a first input accumulator of a plurality of input accumulators of the buffer management system. The first input accumulator corresponds to the first packet characteristic. A threshold quantity of packet data is obtained from the first input accumulator responsive to determining that the threshold quantity of packet data is accumulated in the first input accumulator. The threshold quantity of packet data is stored in a line of a shared buffer for the plurality of packets transmitted via the digital interface. The shared buffer is associated with the plurality of packet characteristics.
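The accumulate-then-flush behavior can be sketched as follows; the class name, threshold, and packet characteristics are illustrative assumptions, not taken from the disclosure:

```python
class BufferManager:
    """Per-characteristic input accumulators feeding one shared buffer.

    Packet data is staged per characteristic; once an accumulator reaches
    the threshold, a full line is moved into the shared buffer.
    """
    def __init__(self, threshold):
        self.threshold = threshold
        self.accumulators = {}     # characteristic -> staged bytes
        self.shared_buffer = []    # lines of exactly `threshold` bytes

    def receive(self, characteristic, data: bytes):
        acc = self.accumulators.setdefault(characteristic, bytearray())
        acc.extend(data)
        # Flush whole lines only, so the shared buffer never holds
        # partially filled lines mixing later arrivals.
        while len(acc) >= self.threshold:
            line = bytes(acc[:self.threshold])
            del acc[:self.threshold]
            self.shared_buffer.append((characteristic, line))

bm = BufferManager(threshold=4)
bm.receive("video", b"ab")     # below threshold: stays staged
bm.receive("audio", b"wxyz")   # exactly one line: flushed
bm.receive("video", b"cdef")   # now 6 bytes staged: one line flushed, 2 remain
```

Keeping the accumulators separate per characteristic is what lets many packet streams share one buffer without interleaving partial data.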
Apparatuses, systems, and techniques are presented to determine optimal parameters for streaming content. In at least one embodiment, network characteristic information is fed as input to a neural network for inferring adjustments to transmission parameters that would be optimal for transmitting content.
H04L 41/16 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
In various examples, a three-stage pipeline is used to automate the generation of paired parts (or components) in assemblies. The pipeline includes a first contact surface extraction stage, in which a set of contact surfaces is extracted from a first part based on attributes identified by a vision language model (VLM) and/or another type of machine learning model from a visual and/or another representation of the first part. The pipeline also includes a shape completion stage, in which the contact surfaces are used to condition the operation of a diffusion model and/or another type of three-dimensional (3D) generative model in generating a shape for a second part that is complementary to the first part. The pipeline further includes a clearance specification stage, in which the shape of a given part is updated to meet a minimum clearance distance from the other part.
In various examples, generating and using interaction graphs for video information retrieval systems and applications is described herein. Systems and methods are disclosed that process videos generated using one or more image sensors in order to generate a graph that represents at least interactions between entities depicted by the videos. For instance, nodes of the graph may be associated with the entities—such as people and/or other objects—as well as attributes associated with the entities. Additionally, edges of the graph may be associated with interactions between the entities, times that the interactions occurred, and/or indications of which videos depict the interactions. Systems and methods are then further disclosed that use the graph to perform information retrieval associated with the videos. For instance, the graph may be used to identify relevant information associated with a query, where the information may then be used to generate a response.
Apparatuses, systems, and techniques for adaptive flow matching. In at least one embodiment, input is received, which includes one or more first variables at first scale. An encoder is used to encode the input to provide a base distribution at the first scale. The base distribution is associated with one or more second variables, and the one or more second variables include one or more variables absent from the one or more first variables. A perturbed base distribution is obtained based on the base distribution and an adaptive noise. A diffusion model is used to generate a target distribution at a second scale. The target distribution is associated with the one or more second variables. The second scale is finer than the first scale.
Apparatuses, systems, and techniques to indicate an extent to which text corresponds to one or more images. In at least one embodiment, an extent to which text corresponds to one or more images is indicated using one or more neural networks and used to train the one or more neural networks.
In various examples, systems and methods are disclosed relating to generating a response from image and/or video input for image/video-based artificial intelligence (AI) systems and applications. Systems and methods are disclosed for a first model (e.g., a teacher model) distilling its knowledge to a second model (a student model). The second model receives a downstream image in a downstream task and generates at least one feature. The first model generates first features corresponding to an image which can be a real image or a synthetic image. The second model generates second features using the image as an input to the second model. Loss with respect to first features is determined. The second model is updated using the loss.
Embodiments of the present disclosure relate to defining tasks in scenes using delta information. In operation, some embodiments first receive or generate scene data. Some embodiments then generate delta information indicating one or more changes within the scene data. For example, responsive to user input, particular embodiments generate the delta information, such as a delta layer. Generating such delta information is useful in various applications such as robotics, simulation, graphics rendering, gaming, autonomous driving, or the like. For instance, with respect to robotics, some embodiments store data corresponding to the delta information as at least part of a task definition for a robotic task. Some embodiments then responsively cause or train one or more real-world robotic components represented by one or more virtual robotic components to perform a task in a real-world scene represented by a virtual scene based on the delta information and the task definition.
Disclosed are apparatuses, systems, and techniques that train and use trained language models to assist users with complex systems installation, troubleshooting, and/or maintenance. A method can include determining, responsive to data received from a real robot having one or more real sensors and operating in a real environment, that the real robot needs assistance to navigate from a current state of the real robot within the real environment, causing simulated data to be obtained from one or more simulated sensors within a simulated environment at least partially modeling the real environment, the one or more simulated sensors including at least one simulated sensor different from the one or more real sensors, and using the simulated data to control operation of the real robot within the real environment in order to navigate the real robot from the current state.
G05D 1/246 - Arrangements for determining position or orientation using environment maps, e.g. simultaneous localisation and mapping [SLAM]
G05D 1/247 - Arrangements for determining position or orientation using signals provided by artificial sources external to the vehicle, e.g. navigation beacons
G05D 101/15 - Details of software or hardware architectures used for the control of position using artificial intelligence [AI] techniques using machine learning, e.g. neural networks
29.
IMPORTANCE SAMPLING ENVIRONMENT MAPS FOR REAL-TIME PATH TRACING
Approaches presented herein provide systems and methods for path tracing using a set of textured spherical surfaces obtained from an importance map for an image. An image representation may be generated using the importance map and evaluated to identify a first set of nodes. The nodes may have associated values, such as luminance values, that may be used to subdivide the nodes into bins to maintain a weighted distribution for the associated values. An array of nodes may be generated for sampling and conversion to a three-dimensional direction that may be applied to one or more lighting effects.
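Importance sampling against a luminance map reduces, in its simplest one-dimensional form, to inverse-CDF sampling; the following sketch (with made-up luminance values) shows brighter texels being selected over larger ranges of the uniform variate:

```python
import bisect
import itertools

def build_cdf(luminance):
    # Normalized cumulative distribution over texel luminance values.
    total = sum(luminance)
    return list(itertools.accumulate(l / total for l in luminance))

def sample_texel(cdf, u):
    # Inverse-CDF lookup: a texel is chosen with probability proportional
    # to its luminance, so bright regions are sampled more often.
    return bisect.bisect_left(cdf, u)

cdf = build_cdf([1.0, 3.0, 4.0, 2.0])   # four toy texels
picks = [sample_texel(cdf, u) for u in (0.05, 0.3, 0.5, 0.95)]
```

A real environment-map sampler extends this to two dimensions (marginal over rows, conditional over columns) and then maps the chosen texel to a direction on the sphere.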
Systems and methods in accordance with the present disclosure can prevent interference of memory operations being performed, for example, by partitioning memory access to safety related applications and non-safety related applications. In various examples, one or more circuits can assign, to a workload for execution on a system-on-a-chip (SoC) and according to a criticality of the workload, at least one threshold for bandwidth of a resource of the SoC. The one or more circuits can control execution of the workload according to the at least one threshold.
Disclosed are apparatuses, systems, and techniques that implement training and deployment of automatic transcription-assisted translation systems that use language models. The techniques include processing, using a first speech-to-text (S2T) model, a first input that includes a speech in a first language to generate a transcription of the speech. The techniques further include processing, using a second S2T model, a second input to generate a translation of the speech to a second language. The second input includes at least a representation of the speech and the transcription of the speech.
Embodiments of the present disclosure provide systems and methods for fine-tuning a pretrained large language model (LLM) for generating hardware description language (HDL) code. In at least one embodiment, a first training dataset that includes correct-by-construction non-textual representation data samples is obtained, and the pretrained LLM is fine-tuned using the first training dataset to provide the fine-tuned LLM for generating HDL code.
A system may simulate human motion for human-robot interactions, such as may involve a handover of an object. Motion capture can be performed for a hand grasping and moving an object to a location and orientation appropriate for a handover, without a need for a robot to be present or an actual handover to occur. This motion data can be used to separately model the hand and the object for use in a handover simulation, where a component such as a physics engine may be used to ensure realistic modeling of the motion or behavior. During a simulation, a robot control model or algorithm can predict an optimal location and orientation to grasp an object, and an optimal path to move to that location and orientation, using a control model or algorithm trained, based at least in part, using the motion models for the hand and object.
In various examples, systems and techniques are directed to network-based heteroassociative retrieval-augmented generation (HRAG) for efficient augmentation of inputs into artificial intelligence models. Example techniques include storing documents in a network-based store (NBS) having multiple stages of matrix multiplication(s) and non-linear activation(s). Storing documents includes modifying parameter(s) of matrix multiplications of at least one of the stages. The example techniques further include processing, using the NBS, a query to obtain retrieved document(s) associated with the query and at least approximately reproducing stored document(s). The example techniques further include processing, using a language model, a prompt that is based at least on the query and the retrieved document(s).
In various examples, analyzing software architectural information using language models is described herein. Systems and methods are disclosed that parse architectural information associated with software—such as software architecture documents (SWADs), design documents, and/or source code—to generate relational diagrams associated with the architectural information. The systems and methods may then use the relational diagrams and one or more language models to analyze the architectural information. For instance, one or more prompts associated with analyzing the architectural information may be obtained, where an individual prompt is associated with performing one or more analysis tasks. The language model(s) may then process input data representing the prompt(s) along with at least a portion of the architectural information (e.g., determined using the relational diagrams) to determine information associated with the tasks.
Apparatuses, systems, and techniques to perform an application programming interface (API) to add one or more graph nodes to a software graph, wherein the API is to cause a semaphore wait node to be added to a software graph based, at least in part, on a dependency type indicated by the API. In at least one embodiment, one or more nodes are added to a graph in accordance with one or more dependency types.
Automatic volumetric quantification can be performed for various parameters of an object by providing volumetric data, such as three-dimensional image data, to at least one neural network. A network can extract features from the data that can be used to infer a point cloud representative of the surface of the object. One or more loss functions can be used to adjust the relevant network parameters. The network can also attempt to infer a segmentation mask for the object, indicating which data values correspond to the object of interest. Since the network performs the segmentation and point cloud generation in parallel, updates to the network parameters can impact the segmentation process, effectively constraining the segmentation based on the inferred shape of the object. Ensuring that the segmentation mask corresponds closely to the surface of the object can cause the segmentation process to be more accurate than conventional segmentation processes alone.
Systems and methods for automatically performing actions for a user or system in response to receiving natural language statements. The systems and methods use a large language model to transform the natural language statements into conditional logic for domain specific systems, which systems can perform the corresponding actions. In domains such as a data center or a smart city, the systems and methods provide an intelligent system that can identify conditional statements including conditions and corresponding actions in response to natural language statements. These conditional statements are interpreted as rules in the context of the domain, such that the methods and system ascertain that conditions are met and perform corresponding actions. If there are issues interpreting the conditionals, the systems and methods can use a feedback loop to gather additional information.
Apparatuses, systems, and techniques to enable verification of content, such as media content. Hashes of content can be digitally signed and stored to a distributed ledger, such that a source of content can be verified and any modification determined.
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for blockwise coding, e.g. D.E.S. systems
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
H04L 9/00 - Arrangements for secret or secure communications; Network security protocols
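The hash-then-verify flow described in the abstract above can be sketched minimally in Python. The in-memory `ledger` dict and the `register`/`verify` names are illustrative stand-ins for a distributed ledger, and the digital-signature step is omitted:

```python
import hashlib

# In-memory stand-in for a distributed ledger mapping content IDs to digests.
ledger = {}

def register(content_id: str, content: bytes) -> str:
    """Hash the content and record the digest on the 'ledger'."""
    digest = hashlib.sha256(content).hexdigest()
    ledger[content_id] = digest
    return digest

def verify(content_id: str, content: bytes) -> bool:
    """Re-hash the content and compare against the recorded digest."""
    return ledger.get(content_id) == hashlib.sha256(content).hexdigest()

register("clip-1", b"original media bytes")
assert verify("clip-1", b"original media bytes")      # source verified
assert not verify("clip-1", b"tampered media bytes")  # modification detected
```

In the actual scheme the recorded digest would be digitally signed before being stored, so the ledger entry itself is attributable to a source.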
40.
Video encoding using reconstruction of spatially decimated frames
Approaches presented herein provide for the high quality, high resolution reconstruction of a sequence of decimated images, such as may be useful for remote desktop applications. The video frames can be sub-sampled or decimated such that each encoded frame only includes a fraction (e.g., ¼) of the total pixel values for the full resolution frame. A server can analyze the current and previous video frames at full resolution to determine motion or actionable changes, and can apply a lowpass filter to those values based on the type of interpolation to be performed on the client. Static values can remain where no motion is detected. When a client receives the decimated and encoded video frames, the client can determine areas of motion and can perform bilinear interpolation for only those portions of the image where motion is detected, and can otherwise perform weaving of the static pixel values received over a limited sequence of decimated video frames.
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
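The client-side weave-vs-interpolate decision described in the abstract above can be sketched as follows. This is an illustrative simplification, not the patented implementation: the motion mask and decimation pattern arrive as inputs, and the interpolation averages horizontal neighbours (with wrap-around via `np.roll`) rather than performing true bilinear filtering:

```python
import numpy as np

def reconstruct(prev_full, decimated, known_mask, motion_mask):
    """Fill a decimated frame: weave static pixels from history, interpolate moving ones.

    prev_full   : full-resolution frame accumulated from earlier decimated frames
    decimated   : current frame carrying values only where known_mask is True
    known_mask  : True where this decimated frame includes a real pixel sample
    motion_mask : True where motion was detected
    """
    out = prev_full.copy()
    out[known_mask] = decimated[known_mask]  # fresh samples always win
    missing = ~known_mask
    # Weave: static missing pixels keep previously received values (already in out).
    # Interpolate: moving missing pixels are averaged from horizontal neighbours.
    left = np.roll(out, 1, axis=1)
    right = np.roll(out, -1, axis=1)
    interp = (left + right) / 2.0
    fill = missing & motion_mask
    out[fill] = interp[fill]
    return out
```

Weaving preserves full detail in static regions across a short sequence of decimated frames, while interpolation avoids combing artifacts where pixels moved.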
41.
Two-phase cooling structure cooling within a single-phase liquid cooled server
A system includes one or more two-phase cooling loops including one or more coolant pumps and one or more cooling condensers integrated within one or more servers to cause one or more first server components to be cooled using a single-phase coolant and to cause one or more second server components consuming more power than the one or more first server components to be cooled using a two-phase coolant. The one or more cooling condensers are to condense and cool the two-phase coolant using the single-phase coolant.
09 - Scientific and electric apparatus and instruments
Goods & Services
Artificial intelligence supercomputers; high performance
computers and computer hardware for artificial intelligence,
machine learning, deep learning, natural language
generation, statistical learning, supervised learning,
un-supervised learning, data mining, predictive analytics
and business intelligence; high performance computers and
computer hardware with specialized features for software
development; high performance computers and computer
hardware with specialized features for developing, testing,
and validating artificial intelligence models and software
applications; high performance computers and computer
hardware with specialized features for data analytics, data
management, data integration, data processing, and data
visualization; high performance computer hardware with
specialized features for development of edge applications;
high performance computers and computer hardware with
specialized features for development of robotics, smart
cities, and computer vision solutions.
43.
ARTIFICIAL INTELLIGENCE BASED RECOGNITION OF DEVICE ENVIRONMENT
Apparatuses, systems, and techniques for environment recognition based on device input. At least audio data captured by a processing device of a user device is provided as input to an environment recognition model to generate an output representative of a predicted environment of the user device. One or more settings of the user device is updated based at least in part on the predicted environment of the user device.
G06V 20/40 - Scenes; Scene-specific elements in video content
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
44.
HASH-BASED ALLOCATION OF APPLICATIONS TO VIRTUALIZED COMPUTING ENVIRONMENTS
Apparatuses, systems, and techniques for allocating application hosting platforms in a virtualized computing environment. A method can include assigning a first set of virtualized computing environments to a first application based on one or more characteristics of the first application, and assigning a second set of virtualized computing environments to a second application of a plurality of applications based on one or more characteristics of the second application, the second set of virtualized computing environments being different from the first set of virtualized computing environments. The method can include causing, in response to a request to execute the first application, an instance of the first application to be executed on a virtualized computing environment of the first set of virtualized computing environments using data stored in a cache of the virtualized computing environment.
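The deterministic, characteristic-based assignment described in the abstract above might be sketched as follows; `assign_environments` and the use of the application name as the hashed characteristic are hypothetical simplifications of the claimed method:

```python
import hashlib

def assign_environments(app_name: str, all_envs: list, set_size: int = 2) -> list:
    """Pick a deterministic subset of environments from a hash of app characteristics."""
    digest = int(hashlib.sha256(app_name.encode()).hexdigest(), 16)
    start = digest % len(all_envs)
    # Take set_size consecutive environments starting at the hashed position.
    return [all_envs[(start + i) % len(all_envs)] for i in range(set_size)]

envs = ["vm-a", "vm-b", "vm-c", "vm-d"]
set1 = assign_environments("app-1", envs)
# The same application always maps to the same environment set, so its
# cached data (dependencies, models) stays warm on those environments.
assert assign_environments("app-1", envs) == set1
```

The benefit of hashing is that requests for the same application consistently land on environments whose caches already hold that application's data, as the last sentence of the abstract describes.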
In various examples, multi-speaker audio is diarized using artificial intelligence models including a sorting functionality. Sorting is performed based on the first time a speaker is indicated as speaking and/or based on the variance of a dimension of a speech embedding. Sorting speech sequences has the advantage of requiring fewer computations of cross-entropy loss during training and/or allowing diarization models to focus on the difference between speakers. Diarized speech may be used to create a transcript in conjunction with automatic speech recognition models.
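The first-speaking-time sort described in the abstract above can be illustrated with a small sketch on a binary speaker-activity matrix. The shapes and names here are hypothetical; real diarization models sort embeddings or training targets, not hard labels:

```python
import numpy as np

def sort_speakers(activity: np.ndarray) -> np.ndarray:
    """Reorder speaker columns by the first frame in which each speaker is active.

    activity: (frames, speakers) binary matrix. After sorting, column 0 is
    whoever speaks first, column 1 whoever speaks next, and so on.
    """
    n_frames, n_speakers = activity.shape
    first_frame = [int(np.argmax(activity[:, s] > 0)) if activity[:, s].any() else n_frames
                   for s in range(n_speakers)]
    order = np.argsort(first_frame, kind="stable")
    return activity[:, order]

# Speaker 1 (middle column) talks first, so sorting moves it to column 0.
a = np.array([[0, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
assert sort_speakers(a)[:, 0].tolist() == [1, 1, 0]
```

Fixing a canonical speaker order like this removes the label-permutation ambiguity, which is why (per the abstract) training needs fewer cross-entropy loss computations.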
In various examples, a technique for performing end-to-end navigation using a generative world model includes converting a set of sensory inputs received by a machine at a current time step into a set of embedded features. The technique also includes generating, via execution of one or more neural networks, one or more states associated with the current time step based at least on the set of embedded features, a history of states preceding the current time step, and a first set of actions associated with a previous time step. The technique further includes converting, via execution of the one or more neural networks, the one or more states into a set of predictions associated with the current time step, and performing, by the machine, a second set of actions associated with the current time step based on the set of predictions.
G05D 101/15 - Details of software or hardware architectures used for the control of position using artificial intelligence [AI] techniques using machine learning, e.g. neural networks
Apparatuses, systems, and techniques for synthesizing motion for three-dimensional (3D) assets. In at least one embodiment, a static 3D asset is obtained based on textual input. The static 3D asset is represented by a 3D spatial representation and an articulated skeleton embedded in the 3D spatial representation. The motion of the articulated skeleton is linked to the deformation of the 3D spatial representation. A sequence of skeleton configurations is generated based on the articulated skeleton, corresponding to a plurality of time steps. A corresponding 3D spatial representation configuration is generated for each skeleton configuration. Based on the sequence of 3D spatial representation configurations, a sequence of image frames is generated to provide a video. A video model evaluates the difference between the video and the textual input to provide a video evaluation loss, which is backpropagated to update the sequence of skeleton configurations.
A computer-implemented technique for training machine learning models includes processing one or more input images using a trained image generative model to generate one or more augmented images, where the trained image generative model generates each augmented image included in the one or more augmented images conditioned on an input image included in the one or more input images, depth information associated with the input image, semantic information associated with the input image, and text describing an augmentation to make to the input image; and performing, based on the one or more augmented images, one or more operations to train an untrained machine learning model to generate a trained machine learning model.
During the rendering of an image, specific pixels in the image are identified where antialiasing would be helpful. Antialiasing is then performed on these identified pixels, where antialiasing is a technique used to add greater realism to a digital image by smoothing jagged edges. This reduces a cost of performing antialiasing by reducing a number of pixels within an image on which antialiasing is performed.
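The selective strategy described in the abstract above (flag the jagged pixels, smooth only those) can be sketched as follows. The gradient threshold and the 3x3 box blur are generic illustrative choices, not the disclosed detection or filtering method:

```python
import numpy as np

def selective_antialias(img: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Smooth only the pixels whose local gradient exceeds a threshold."""
    # Flag likely jagged-edge pixels from horizontal + vertical gradients.
    gx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))
    gy = np.abs(np.diff(img, axis=0, prepend=img[:1, :]))
    edges = (gx + gy) > threshold
    # 3x3 box blur (computed densely here for brevity; a renderer would
    # evaluate it only at the flagged pixels to realize the cost saving).
    padded = np.pad(img, 1, mode="edge")
    blurred = sum(padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
                  for dy in range(3) for dx in range(3)) / 9.0
    out = img.copy()
    out[edges] = blurred[edges]  # antialias only where flagged
    return out
```

On a flat image region nothing changes; only pixels along sharp transitions are blended, which is the source of the claimed cost reduction.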
Apparatuses, systems, and techniques to generate images of objects. In at least one embodiment, one or more neural networks are trained to identify one or more objects within one or more images, and the one or more neural networks are used to generate an image of one or more objects.
Apparatuses, systems, and techniques to process image data. In at least one embodiment, a neural network is trained to perform demosaicing of two-dimensional image data obtained from an image sensor.
Apparatuses, systems, and techniques to generate code to be performed by one or more first processors based, at least in part, on one or more indications of data to be used by one or more second processors. In at least one embodiment, a CUDA program includes host code and device code, and a linker uses references for code elements in host code to link or prune code elements from device code.
Systems and methods for cooling a datacenter are disclosed. In at least one embodiment, a cold plate has microchannels and a heat pipe to support a first fluid in an active mode of operation of a cold plate that uses microchannels, and to support a second fluid in a passive mode of operation of a cold plate that uses a heat pipe.
Disclosed are apparatuses, systems, and techniques that may use machine learning for implementing speaker diarization. The techniques include obtaining a speaker embedding for various reference times of a speech and for various differently-sized time intervals, and identifying a plurality of clusters, each cluster associated with a different speaker of the speech. The techniques further include computing, using the speaker embeddings, a set of embedding weights for various differently-sized time intervals, and identifying, using the computed set of the embedding weights, one or more speakers speaking at a respective reference time.
G10L 25/87 - Detection of discrete points within a voice signal
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer hardware for use in enhancing the performance of
data centers supporting software applications for scaling
and development of artificial intelligence models and
architectures; computer hardware for use in enhancing the
performance of data centers supporting software applications
using artificial intelligence for machine learning, deep
learning, natural language generation, statistical learning,
supervised learning, un-supervised learning, data mining,
predictive analytics, business intelligence and computer
vision; integrated circuits, semiconductors and computer
chipsets for use in enhancing the performance of data
centers supporting software applications for scaling and
development of artificial intelligence models and
architecture; integrated circuits, semiconductors and
computer chipsets for use in enhancing the performance of
data centers supporting software applications using
artificial intelligence for machine learning, deep learning,
natural language generation, statistical learning,
supervised learning, un-supervised learning, data mining,
predictive analytics, business intelligence and computer
vision; computer hardware; integrated circuits,
semiconductors and computer chipsets; graphics processing
units (GPUs); embedded processors for computers; computer
networking hardware; computer networking switches; computer
hardware for communication among central processing units
(CPUs); data processing units (DPUs); computer hardware for
enabling connections among central processing units (CPUs),
servers and data storage devices. Design and development of computer hardware for use in
enhancing the performance of data centers supporting
software applications for scaling and development of
artificial intelligence models and architecture; design and
development of computer hardware for use in enhancing the
performance of data centers supporting software applications
using artificial intelligence for machine learning, deep
learning, natural language generation, statistical learning,
supervised learning, un-supervised learning, data mining,
predictive analytics, business intelligence and computer
vision; design and development of integrated circuits,
semiconductors and computer chipsets for use in enhancing
the performance of data centers supporting software
applications for scaling and development of artificial
intelligence models and architecture; design and development
of integrated circuits, semiconductors and computer chipsets
for use in enhancing the performance of data centers
supporting software applications using artificial
intelligence for machine learning, deep learning, natural
language generation, statistical learning, supervised
learning, un-supervised learning, data mining, predictive
analytics, business intelligence and computer vision; design
and development of computer hardware; design and development
in the field of computer networking hardware; design and
development in the field of computer datacenter
architecture; design and development of computer hardware,
namely, integrated circuits, semiconductors, computer
chipsets, graphics processing units (GPUs), embedded
processors for computers, computer networking hardware,
computer networking switches, computer hardware for
communication among central processing units (CPUs), data
processing units (DPUs), and data storage systems, computer
hardware for enabling connections among central processing
units (CPUs), servers and data storage devices; computer
technical support services, namely, management and
optimization of software and hardware for data centers.
56.
PRE-FABRICATED PIN-BASED VERTICAL ELECTRICAL CONNECTIVITY IN A PACKAGE SUBSTRATE
A substrate is disclosed. In one embodiment, the substrate comprises a substrate core including a plurality of through holes located therethrough, a plurality of metal pins aligned in the plurality of through holes, and at least one layer deposited on at least one of top and bottom surfaces of the substrate core. In one embodiment, the plurality of metal pins are aligned with the plurality of through holes such that each of the plurality of metal pins extends at least to both the top and bottom surfaces of the substrate core. In some embodiments, the deposited at least one layer is deposited after the plurality of metal pins have been aligned in the through holes of the substrate core.
H01L 23/538 - Arrangements for conducting electric current within the device in operation from one component to another the interconnection structure between a plurality of semiconductor chips being formed on, or in, insulating substrates
H01L 21/48 - Manufacture or treatment of parts, e.g. containers, prior to assembly of the devices, using processes not provided for in a single one of the groups or
H01L 23/14 - Mountings, e.g. non-detachable insulating substrates characterised by the material or its electrical properties
Electrical device including a substrate having a frontside surface and a backside surface, and a back-side insulating layer with back-side metal tracks therein, the back-side insulating layer located on the backside surface. At least a portion of at least one of the back-side metal tracks is connected to a signal source to carry a global signal along the portion of the at least one of the back-side metal tracks towards a signal receiver. A method of manufacture includes providing a substrate, forming a back-side insulating layer on the backside surface of the substrate with back-side metal tracks therein, providing a signal source and a signal receiver, and connecting at least a portion of at least one of the back-side metal tracks to the signal source, the portion carrying the global signal from the signal source towards the signal receiver.
In various examples, wavelet prediction-based image reconstruction for image processing systems and applications is provided. A deep learning model may use derived frequency bands to predict sub-pixel-level information to perform predictive resampling as well as image/video artifact removal. The model may learn to predict missing frequency components while removing artifacts to generate resampled-resolution image predictions based on the original input image. The model may comprise distinct frequency domain and spatial domain paths. The frequency domain path may process frequency domain sub-band images to introduce individualized non-linearity. Spatial domain prediction data may be generated based on the upsampled original input image. Substantive corrections may be applied by mapping the spatial domain prediction data into frequency sub-band images and correcting the sub-band images based on frequency domain prediction data. The resulting corrected sub-band images may be applied to an inverse DWT to reconstruct a resampled version of the input image.
G06T 5/10 - Image enhancement or restoration using non-spatial domain filtering
G06T 3/4046 - Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
G06T 3/4053 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
G06T 3/4084 - Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
G06T 5/60 - Image enhancement or restoration using machine learning, e.g. neural networks
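The sub-band decomposition and inverse-DWT reconstruction underlying the abstract above can be illustrated with a one-level 1D Haar transform. This is a generic textbook wavelet, not the disclosed model; in the actual pipeline the sub-bands would be corrected by learned predictions before inversion:

```python
import numpy as np

def haar_dwt(x):
    """One-level 1D Haar transform: returns (approximation, detail) sub-bands."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_idwt(a, d):
    """Inverse one-level Haar transform reconstructing the original signal."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

sig = np.array([4.0, 2.0, 5.0, 7.0])
approx, detail = haar_dwt(sig)
# A learned model could correct/predict the sub-bands here before inverting.
assert np.allclose(haar_idwt(approx, detail), sig)
```

The transform is exactly invertible, so any corrections applied to the sub-band images carry through losslessly to the reconstructed image.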
59.
DYNAMIC BUFFER SIZING AND HARDWARE-ASSISTED FRAME PACING WHILE APPLICATION STREAMING
Various examples, systems, and methods are disclosed relating to buffer sizing and frame pacing. A first computing system can decode an encoded bitstream of a data stream to extract a plurality of frames corresponding to a plurality of first presentation times. The first computing system can update a target buffer size of at least one buffer based on one or more deviations between one or more expected frame arrival times of the encoded bitstream and one or more actual frame arrival times. The first computing system can store the plurality of frames in the at least one buffer for scheduling presentation on a display. The first computing system can present the plurality of frames at a plurality of second presentation times on the display based on at least one tuning of the plurality of first presentation times responsive to an update in the target buffer size.
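The deviation-driven buffer-size update described in the abstract above might look like the following sketch. The policy here (target tracks the worst recent deviation, measured in frame intervals) is a hypothetical simplification; a real implementation would use tuned smoothing and hardware-assisted pacing:

```python
from collections import deque

class FramePacer:
    """Grow or shrink a jitter-buffer target from observed arrival-time deviations."""

    def __init__(self, frame_interval_ms: float, max_frames: int = 8):
        self.frame_interval_ms = frame_interval_ms
        self.max_frames = max_frames
        self.deviations = deque(maxlen=30)  # sliding window of |actual - expected|
        self.target_size = 1

    def on_frame(self, expected_ms: float, actual_ms: float) -> int:
        self.deviations.append(abs(actual_ms - expected_ms))
        worst = max(self.deviations)
        # Enough frames to absorb the worst recent deviation, clamped to a ceiling.
        frames_needed = 1 + int(worst // self.frame_interval_ms)
        self.target_size = min(max(frames_needed, 1), self.max_frames)
        return self.target_size

pacer = FramePacer(frame_interval_ms=16.7)
pacer.on_frame(expected_ms=0.0, actual_ms=2.0)    # small jitter: stay at 1 frame
assert pacer.target_size == 1
pacer.on_frame(expected_ms=16.7, actual_ms=60.0)  # ~43 ms late: buffer grows
assert pacer.target_size == 3
```

A deeper buffer absorbs network jitter at the cost of latency, so sizing it from measured deviations keeps latency minimal when the stream is well behaved.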
In various examples, systems and methods are disclosed relating to automated network infrastructure diagnostic operations using generative artificial intelligence. A system can classify at least one log of a set of logs produced by a network system as corresponding to a network anomaly. Upon classifying the at least one log as corresponding to the network anomaly, the system can generate, using a machine-learning model and the at least one log, a command to produce a message comprising natural language output identifying the network anomaly. The system can cause performance of one or more maintenance actions on the network system based on the message to address the network anomaly.
In various examples, a technique for generating simulation data includes generating, via one or more simulations, simulation data associated with operation of a first machine in an environment. The technique also includes determining a command to the first machine based at least on the simulation data and a goal associated with the first machine and updating the simulation data based at least on the command. The technique further includes storing the simulation data, the command, and the updated simulation data in one or more data records, and causing a second machine to perform one or more actions based at least on the one or more data records.
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
62.
VISUAL CHAIN-OF-THOUGHT REASONING FOR ROBOT VISION-LANGUAGE-ACTION MODELS
Apparatuses, systems, and techniques are disclosed for controlling a robot to execute a task. In at least one embodiment, a current image of the robot in an environment and a text describing the task are obtained. A future image of the robot in the environment is predicted based on the current image and the text. Subsequently, one or more actions are predicted based on the current image, the future image, and the text. The one or more actions can move the robot from a first state corresponding to the current image to a second state corresponding to the future image. The robot executes the one or more actions to move in the environment.
The rise of specialized vision foundation models has created a need for methods to consolidate knowledge from multiple models (i.e., the teachers) into a single model (i.e., the student). However, this type of knowledge agglomeration leaves open several critical challenges: teacher models typically operate at varying resolutions due to different architectures and training goals, creating feature-granularity inconsistencies; existing models have different distribution moments, which can result in biased learning; and computer vision models are oftentimes trained to produce features at a particular resolution and therefore do not generalize well to tasks requiring different resolutions. The present disclosure provides multi-resolution and multi-teacher based training of a computer vision model, which can capture both fine details and broader abstractions from the teacher models, can prevent biased learning among the teacher models, and can produce a flexible computer vision model for different feature resolutions.
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/52 - Scale-space analysis, e.g. wavelet analysis
Various embodiments include techniques for inter-processor communication. The techniques include generating a relaxed store message that comprises a first epoch number, incrementing a first counter associated with the first epoch number, and transmitting the relaxed store message to a first directory included in a cache.
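The epoch-counter bookkeeping described in the abstract above can be sketched in software. This is a hypothetical illustration of counting in-flight relaxed stores per epoch, not NVIDIA's hardware protocol:

```python
from collections import defaultdict

class Directory:
    """Counts relaxed stores per epoch so a fence can wait until an epoch drains."""

    def __init__(self):
        self.pending = defaultdict(int)  # epoch number -> in-flight relaxed stores

    def on_relaxed_store(self, epoch: int):
        self.pending[epoch] += 1         # store issued, counted against its epoch

    def on_store_complete(self, epoch: int):
        self.pending[epoch] -= 1         # store reached the directory/cache

    def epoch_drained(self, epoch: int) -> bool:
        return self.pending[epoch] == 0

d = Directory()
d.on_relaxed_store(epoch=1)
d.on_relaxed_store(epoch=1)
assert not d.epoch_drained(1)
d.on_store_complete(1)
d.on_store_complete(1)
assert d.epoch_drained(1)
```

Tagging each relaxed store with an epoch number lets ordering be enforced per epoch rather than per store, which is what makes the stores "relaxed".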
In various examples, live perception from sensors of a vehicle may be leveraged to generate potential paths for the vehicle to navigate an intersection in real-time or near real-time. For example, a deep neural network (DNN) may be trained to compute various outputs—such as heat maps corresponding to key points associated with the intersection, vector fields corresponding to directionality, heading, and offsets with respect to lanes, intensity maps corresponding to widths of lanes, and/or classifications corresponding to line segments of the intersection. The outputs may be decoded and/or otherwise post-processed to reconstruct an intersection—or key points corresponding thereto—and to determine proposed or potential paths for navigating the vehicle through the intersection.
G01C 21/26 - Navigation; Navigational instruments not provided for in groups specially adapted for navigation in a road network
G05D 1/249 - Arrangements for determining position or orientation using signals provided by artificial sources external to the vehicle, e.g. navigation beacons from positioning sensors located off-board the vehicle, e.g. from cameras
G05D 1/437 - Control of position or course in two dimensions for aircraft during their ground movement
G06N 3/04 - Architecture, e.g. interconnection topology
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
67.
OCCUPANCY PREDICTION USING FORWARD-BACKWARD VIEW TRANSFORMATION
Apparatuses, systems, and techniques for using one or more machine learning processes (e.g., neural network(s)) to predict occupancy using an image input. In at least one embodiment, image data is processed using a neural network to predict occupancy in a 3D voxel space. In at least one embodiment, image data is processed using a neural network to detect objects in a 3D space.
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
B60W 40/02 - Estimation or calculation of driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit related to ambient conditions
G06T 3/00 - Geometric image transformations in the plane of the image
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
68.
Video prediction using one or more neural networks
Apparatuses, systems, and techniques to enhance video are disclosed. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having one or more additional video frames.
Apparatuses, systems, and techniques to perform an application programming interface (API) to indicate one or more graph node functions of one or more graph nodes to be added to a software graph based, at least in part, on a dependency type indicated by the API. In at least one embodiment, one or more graph nodes are added to a software graph based on a node type and a dependency type.
An integrated circuit (IC) is disclosed. In one embodiment, the IC comprises a memory and at least one processing core coupled to the memory. In one embodiment, the operations performed by the at least one processing core operatively coupled to the memory comprise: storing final hashes of each block chain authentication (BCA) section of an in-system test (IST) image in a write-protected portion of the memory; receiving at least one Public Key Authentication (PKA) signature; authenticating the at least one PKA signature using a public key cryptography (PKC) algorithm; once the PKA signature is authenticated, using the final hash for each BCA section to authenticate the portion of the IST image associated with that BCA section using a hash engine of the at least one processing core, skipping any PKA signature checks for the final hashes; and testing portions of the IC with the authenticated IST image portions.
G06F 21/16 - Program or content traceability, e.g. by watermarking
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
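The per-section hash check described in the abstract above, with the one-time signature verification omitted, might look like this sketch (all names hypothetical):

```python
import hashlib

# Trusted digests for each BCA section, as if held in write-protected memory
# after a single public-key signature check (signature step omitted here).
section_a = b"test-vectors-for-block-a"
section_b = b"test-vectors-for-block-b"
trusted_hashes = [hashlib.sha256(section_a).digest(),
                  hashlib.sha256(section_b).digest()]

def authenticate_section(index: int, data: bytes) -> bool:
    """Per-section check uses only a hash compare, no further PKA signatures."""
    return hashlib.sha256(data).digest() == trusted_hashes[index]

assert authenticate_section(0, section_a)
assert not authenticate_section(1, b"corrupted section")
```

Verifying one signature up front and then checking each section by hash alone trades many expensive public-key operations for cheap hash-engine compares.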
Apparatuses, systems, and techniques to process image frames. In at least one embodiment, motion information of one or more pixels in one or more video frames is generated based, at least in part, on depth information of the one or more pixels.
Approaches presented herein provide systems and methods for content generation systems that incorporate a denoising diffusion generative adversarial network (DDGAN) into a content generation pipeline. A generator associated with the DDGAN may be conditioned on a set of input images that include at least a noisy image from a diffusion engine, an upsampled low resolution image, and a historical image. The generator may be used to generate an output image having one or more properties that are different from a content engine. Weights for the generator may be determined during a training process that includes a discriminator that evaluates at least the noisy image, the upsampled low resolution image, and a noised image produced from the output image.
G06T 3/4046 - Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
G06T 3/4053 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
73.
GENERATING CODING UNIT TREE STATISTICAL DATA TO ENHANCE PERFORMANCE AND COMPRESSION EFFICIENCY
Various embodiments include techniques for generating coding unit tree statistical data for video blocks included in a media frame. The disclosed video encoder includes a first-in-first-out (FIFO) memory for storing encoding data for multiple video blocks. A first set of units within the video encoder generates encoding data for each video block. After storing the encoding data in FIFO memory, the first set of units can proceed with encoding additional blocks of the media frame without having to wait for a second set of units within the video encoder to retrieve and process the encoding data. Subsequently, the second set of units can efficiently retrieve and process data for multiple blocks of the media frame at a time. Further, the video encoder can generate both forward looking prediction data and backward looking prediction data to encode a current block of the media frame, resulting in improved video quality.
H04N 19/423 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
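The decoupling described in entry 73, in which a first set of units deposits per-block encoding data into a FIFO and proceeds without waiting for the second set of units, can be illustrated with a minimal sketch. The class and field names below are hypothetical and stand in for the hardware units described in the abstract.

```python
from collections import deque

class BlockFIFO:
    """Illustrative FIFO decoupling two encoder stages (names are assumptions,
    not the actual hardware interfaces)."""
    def __init__(self):
        self.q = deque()

    def push(self, block_stats):
        # First-stage units store encoding data and immediately move on
        # to the next block of the media frame.
        self.q.append(block_stats)

    def drain(self, n):
        # Second-stage units later retrieve and process data for up to
        # n blocks of the frame at a time.
        out = []
        while self.q and len(out) < n:
            out.append(self.q.popleft())
        return out

fifo = BlockFIFO()
for blk in range(6):                 # first stage encodes blocks back-to-back
    fifo.push({"block": blk, "bits": 100 + blk})
batch = fifo.drain(4)                # second stage processes 4 blocks at once
```

The first stage never blocks on the second; throughput is limited only by FIFO capacity rather than by the slower consumer.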
74.
INCREASING BANDWIDTH OVER A COMMUNICATION CHANNEL INTERCONNECT WITH DATA REQUEST MODIFICATION
A device includes a memory and one or more processing devices operatively coupled to the memory. The one or more processing devices are to determine that a data request comprises first persistent data, remove the first persistent data from the data request to obtain first dynamic data, generate first modification data representing the first persistent data, combine the first dynamic data and first modification data to obtain a first modified data request, and cause the first modified data request to be transmitted to a second device over a communication link.
One embodiment sets forth a method for code instrumentation. The method includes, in response to determining that a first portion of machine code was not instrumented during compilation of the first portion of machine code: performing one or more operations to instrument at least one part of the first portion of machine code, and executing the first portion of machine code.
Approaches presented herein provide for the use of reinforcement learning to fine-tune a generative model, such as a motion diffusion model, for a specific objective, such as to generate representations of human motion corresponding to provided text input. A discriminator can be used to guide the training of the generative model. In at least one embodiment, the discriminator can compare the input text and generated motion representation (or embeddings of each) to determine an alignment value or match score, for example, which can then be used to adjust the network parameters or weights of the generative model to improve the alignment between input text and generated motion.
Various examples, systems, and methods are disclosed relating to frame selection via activity-based ranking and optimization. A first computing system can receive a plurality of frames and metadata from a capture device capturing a video stream. The first computing system can generate, using a ranking model, a plurality of rankings for the plurality of frames based on a plurality of video parameters of the plurality of frames and the metadata, wherein the plurality of rankings correspond to a summarization of the video stream. The first computing system can determine at least one of the plurality of frames to provide to at least one buffer based on the plurality of rankings, wherein the at least one buffer stores a subset of frames of the plurality of frames. The first computing system can provide, from the at least one buffer, the subset of frames as input to a machine-learning model.
G06F 16/783 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
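The ranking-and-buffering step in the frame-selection abstract above can be sketched in a few lines. The activity score below is a placeholder for the ranking model's output over the frames' video parameters and metadata; all names are assumptions.

```python
def select_frames(frames, buffer_size):
    """Rank frames by a toy activity score and keep the top-ranked subset
    in a buffer. The score stands in for the ranking model in the abstract."""
    ranked = sorted(frames, key=lambda f: f["activity"], reverse=True)
    return ranked[:buffer_size]   # the buffer stores a subset of the frames

frames = [{"id": i, "activity": a} for i, a in enumerate([0.1, 0.9, 0.4, 0.7])]
buffer = select_frames(frames, buffer_size=2)
# the buffered subset would then be provided as input to a machine-learning model
```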
78.
STORAGE INSTRUCTION FOR MATRIX MULTIPLY-ACCUMULATE OPERATIONS
Apparatuses, systems, and techniques to perform an instruction to use storage to store information to be used exclusively by one or more tensor operations. In at least one embodiment, a processor retrieves information from storage that exclusively stores matrix information in response to an instruction and performs a multiplication computation using said matrix information.
Apparatuses, systems, and techniques to perform a matrix multiply-accumulate (MMA) instruction to cause a plurality of portions of an MMA operation to be performed using a corresponding plurality of MMA accelerators. In at least one embodiment, a processor retrieves a plurality of matrix information from a memory that exclusively stores matrix information and performs a multiplication computation using said matrix information.
Embodiments of the present disclosure relate to applications, platforms, architecture, etc. for using a master test image that may be used for multiple different tests. For example, a testing system may include a register bank that may be loaded with test configurations corresponding to one or more tests. The test configurations may respectively correspond to sets of control packets included in the master test image that may be used or executed for corresponding tests. The test configurations may indicate execution orders of their respective sets of control packets in which the execution order of one or more of the control packets included in the sets of control packets may differ from a default execution order of such control packets as indicated in the master test image. Such a configuration may accordingly allow for the flexibility of performing many different tests using a single master test image.
In various examples, systems and methods are disclosed relating to generating video streams for generative artificial intelligence models. A system can receive a plurality of frames from a capture device capturing a video stream. The system can determine that at least one frame of the plurality of frames is to be provided as input to a machine-learning model. The system can generate an indication that the at least one frame is to be provided as input to the machine-learning model. The system can generate an encoded bitstream for the video stream. The encoded bitstream can include encoded data for the plurality of frames and the indication.
H04N 19/184 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
H04N 19/139 - Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
The present disclosure relates to obtaining a data recording. The data recording may correspond to sensor data that includes frame data corresponding to one or more frames that depict a scene as represented by the frame data. The frame data of the one or more frames may be compared against an annotated dataset that may include known features and annotations corresponding to the known features. One or more features in the one or more frames may be identified based at least on the comparison between the frame data and the annotated dataset. A subset of the one or more frames including one or more features associated with one or more operational domains may be determined. Additionally, the subset of frames may be provided to a detection model as training data.
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces; using context analysis; Selection of dictionaries
G06V 20/40 - Scenes; Scene-specific elements in video content
83.
MULTI-PROCESSING UNIT STATUS AGGREGATION AND TRANSMISSION
Systems and methods are directed toward collecting, aggregating, arbitrating, and transmitting data streams from one or more different source locations. An intermediary system may be positioned between a central controller and a variety of source locations to receive data streams from the different source locations along independent data connections. The intermediary system may identify information for transport to a central controller along a separate connection while delaying or otherwise managing the remaining incoming data streams. Upon determining transmission is complete, the intermediary system may then select another data stream for processing while continuing to delay or manage the remaining incoming data streams.
In various examples, techniques for automatically generating and maintaining data cards for datasets are described herein. Systems and methods are disclosed that process a dataset in order to identify relevant information associated with the dataset. For example, the dataset may include and/or be associated with sources of information, such as files, documents, links, memos, research papers, annotations, labels, and/or the like, that describe data instances (e.g., images, audio clips, point clouds, etc.) included in the dataset. These sources of information may then be analyzed to retrieve the relevant information associated with the dataset. Systems and methods are then further disclosed that may use one or more language models to process input data associated with the relevant information in order to generate a data card associated with the dataset.
Approaches presented herein provide for the automatic inspection of a material element having an expected size, shape, and location. Such automatic analysis can be useful for inspecting the installation of an element, such as a patch of thermal interface material (TIM) attached to a heat sink of a graphics card. A digital image can be captured and cropped to a region of interest including an element to be inspected. A contour of the element can be identified and used to determine the location of a first edge of the element. Test lines can be swept across the area of the contour until at least one edge criterion is satisfied for additional edges of the element. The intersections of these edges can be identified and used as approximations of the corners or vertices of the element. The coordinates of these corners, or values calculated therefrom, can be compared to expected coordinates from a reference standard to determine whether the element satisfies one or more inspection criteria.
Apparatuses, systems, methods, and techniques to obtain information transmitted in data packets and store data, based at least in part on the information, in GPU memory. In at least one embodiment, the information is obtained and stored in GPU memory without using a central processing unit (CPU). In at least one embodiment, a data processing unit (DPU) receives incoming data packets, stores information based at least in part on the data packets in DPU memory, and initiates a transfer directly to GPU memory. In at least one embodiment, the data packets are TCP data packets.
Traditionally, a software application is developed, tested, and then published for use by end users. Any subsequent update made to the software application is generally in the form of a human programmed modification made to the code in the software application itself, and further only becomes usable once tested, published, and installed by end users having the previous version of the software application. This typical software application lifecycle causes delays in not only generating improvements to software applications, but also to those improvements being made accessible to end users. To help avoid these delays and improve performance of software applications, deep learning models may be made accessible to the software applications for use in providing inferenced data to the software applications, which the software applications may then use as desired. These deep learning models can furthermore be improved independently of the software applications using manual and/or automated processes.
A semiconductor device includes a scan data output and a scan chain having length n. Scan data bits appear sequentially at the scan data output responsive to a scan clock when the device is in a scan mode. Initial masking circuitry is operable to obscure, responsive to the device transitioning from a non-scan mode to the scan mode, a first n bits of the scan data appearing at the scan data output and not to obscure n+1st and subsequent bits of the scan data appearing at the scan data output. Infinite masking circuitry is operable to obscure, responsive to an infinite masking trigger, all scan data bits appearing at the scan data output until a reset of the device occurs. Monitoring and security enforcement circuitry is operable to detect a configuration change and to generate the infinite masking trigger if the detected change corresponds to a potential security risk.
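The two masking behaviors in the scan-chain abstract above (obscuring the first n bits after a mode transition, and obscuring everything after a security trigger until reset) can be modeled in software. This is an illustrative behavioral model only, not the disclosed circuitry; all names are assumptions.

```python
class ScanMasker:
    """Behavioral model of initial and infinite scan-output masking."""
    def __init__(self, n):
        self.n = n              # scan chain length
        self.remaining = 0      # initial-mask counter
        self.infinite = False   # infinite-mask latch

    def enter_scan_mode(self):
        self.remaining = self.n          # obscure the first n bits

    def trigger_infinite_mask(self):
        self.infinite = True             # obscure all bits until reset

    def reset(self):
        self.infinite = False
        self.remaining = 0

    def shift_out(self, bit):
        if self.infinite or self.remaining > 0:
            self.remaining = max(0, self.remaining - 1)
            return 0                     # masked value at the scan data output
        return bit                       # n+1st and subsequent bits pass through

m = ScanMasker(n=3)
m.enter_scan_mode()
out = [m.shift_out(1) for _ in range(5)]   # first 3 bits obscured, rest visible
```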
Disclosed are systems utilizing a vision-language model configured with a training dataset that includes images labeled with spatial question-and-answer pairs, the question-and-answer pairs encoding object-to-object relationships and object-to-space relationships depicted in the images, and at least one data processor configured to operate the vision-language model to carry out a robotic task.
A system includes a processing core, a transmission driver coupled to the processing core and to a channel, the transmission driver including an inverter and a capacitor coupled in series to the channel. A bypass switch is coupled across the capacitor and is operable in response to a bypass enable signal from the processing core. The processing core is configured to determine that the transmission driver is to exit a transmission mode and cause, via the bypass enable signal, the bypass switch to be closed. The processing core is configured to trigger the transmission driver to cause a voltage of the channel to at least satisfy a first threshold value.
H03K 17/56 - Electronic switching or gating, i.e. not by contact-making and -breaking characterised by the use of specified components by the use, as active elements, of semiconductor devices
H04B 1/401 - Circuits for selecting or indicating operating mode
91.
IMPLEMENTING SPECIALIZED FLOATING POINT INSTRUCTIONS ON AN INTEGER PIPELINE FOR ACCELERATING DYNAMIC PROGRAMMING ALGORITHMS
Various techniques for accelerating dynamic programming algorithms are provided. For example, a fused addition and comparison instruction, a three-operand comparison instruction, and a two-operand comparison instruction are used to accelerate a Needleman-Wunsch algorithm that determines an optimized global alignment of subsequences over two entire sequences. In another example, the fused addition and comparison instruction is used in an innermost loop of a Floyd-Warshall algorithm to reduce the number of instructions required to determine shortest paths between pairs of vertices in a graph. In another example, a two-way single instruction multiple data (SIMD) floating point variant of the three-operand comparison instruction is used to reduce the number of instructions required to determine the median of an array of floating point values.
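The Floyd-Warshall example in entry 91 replaces the separate add and compare in the innermost loop with one fused operation. A minimal sketch is below: `fused_add_min` stands in for the fused addition-and-comparison instruction, and the helper names are illustrative.

```python
def fused_add_min(a, b, c):
    """Stands in for the fused addition-and-comparison instruction:
    a single operation computing min(a + b, c)."""
    return min(a + b, c)

def floyd_warshall(dist):
    """All-pairs shortest paths; dist is an n x n adjacency matrix."""
    n = len(dist)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # innermost loop: one fused op instead of add + compare + select
                dist[i][j] = fused_add_min(dist[i][k], dist[k][j], dist[i][j])
    return dist

INF = float("inf")
# edges: 0->1 (3), 1->2 (1), 2->0 (2)
d = floyd_warshall([[0, 3, INF], [INF, 0, 1], [2, INF, 0]])
```

On hardware exposing such an instruction, the three-instruction add/compare/select sequence in this loop collapses to one, which is the instruction-count reduction the abstract describes.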
Apparatuses, systems, and techniques to facilitate memory management. In at least one embodiment, an application programming interface is performed to cause physical memory corresponding to shared virtual memory to be designated for use by a plurality of processors.
In various examples, geometries associated with one or more paths in an environment may be efficiently tracked and/or predicted using recursive models. For instance, the disclosed systems and methods may use Kalman filters to track and predict control points corresponding to Bezier curves (e.g., 2D and/or 3D Bezier curves). The Bezier curves may be representative of geometries associated with one or more lanes of a driving surface. In some instances, multiple Bezier curves may be used to represent a geometry of a lane, and multiple Kalman filters may be used to track and predict control points for each Bezier curve. For instance, an edge of the lane may be represented using a first Bezier curve, and control points for the first Bezier curve may be tracked and predicted using multiple Kalman filters (e.g., for a 3D Bezier curve, one Kalman filter for each x, y, or z dimension).
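The per-dimension tracking described above (one Kalman filter per x, y, or z coordinate of each Bezier control point) can be sketched with a scalar filter. This is a minimal constant-position model under assumed noise parameters, not the disclosed filter design.

```python
class Kalman1D:
    """Minimal 1D Kalman filter for a single control-point coordinate;
    one such filter would run per x, y, or z dimension of each point.
    q and r are assumed process and measurement noise variances."""
    def __init__(self, x0, p0=1.0, q=0.01, r=0.1):
        self.x, self.p, self.q, self.r = x0, p0, q, r

    def update(self, z):
        self.p += self.q                       # predict (constant-position model)
        k = self.p / (self.p + self.r)         # Kalman gain
        self.x += k * (z - self.x)             # correct with measurement z
        self.p *= (1 - k)
        return self.x

kf = Kalman1D(x0=0.0)
# repeated observations of a control-point coordinate near 1.0
estimates = [kf.update(z) for z in [1.0, 1.0, 1.0, 1.0]]
```

The estimate converges toward the observed coordinate while smoothing per-frame measurement noise, which is what makes recursive filters attractive for tracking curve geometry over time.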
Approaches presented herein provide for the training and use of generative models to generate high quality image data for novel reconstruction views. A generative model such as a neural radiance field (NeRF) can be trained to generate such content. In order to train the NeRF to represent a specific scene with high accuracy, the NeRF can be trained using a diffusion model for score distillation guidance. The diffusion model can be trained using a large set of environment data from a variety of different views, then fine-tuned for a specific domain and/or scene. A parameter-efficient training process can be used to avoid overfitting of the diffusion model to the domain- or scene-specific training data. Once fine-tuned, the “expert” diffusion model can be used with the NeRF during training to effectively transfer the expert knowledge to the NeRF, enabling the NeRF to generate high quality image data for the scene from viewpoints corresponding to extreme novel views.
Sorting data in memory is a fundamental computation that facilitates a wide range of search and query problems, aids in the construction and manipulation of data structures, and can improve the spatial and temporal locality of data and computation. Oftentimes, merge-based designs are used for sorting data, where a block sorting pass is performed followed by merging passes that produce increasingly larger sorted sublists until only a single list remains. While conventional merge-based designs can support multiple (K) merge operations in a single pass, a balance must be struck since there exists a point where K-scaling is no longer profitable. The present disclosure provides an alternative merge-based design in which data is merged using a single feed-forward data path.
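The conventional merge-based design the paragraph above contrasts against, a block sorting pass followed by K-way merging passes that produce increasingly larger sorted sublists, can be sketched as follows. The function names and the use of `heapq.merge` for the K-way merge are illustrative choices.

```python
import heapq

def k_way_merge_pass(sublists, k):
    """One merging pass: merge groups of up to k sorted sublists
    into larger sorted sublists."""
    return [list(heapq.merge(*sublists[i:i + k]))
            for i in range(0, len(sublists), k)]

def merge_sort_by_passes(values, block=2, k=3):
    # block sorting pass: sort fixed-size blocks independently
    runs = [sorted(values[i:i + block]) for i in range(0, len(values), block)]
    # merging passes until only a single sorted list remains
    while len(runs) > 1:
        runs = k_way_merge_pass(runs, k)
    return runs[0] if runs else []

result = merge_sort_by_passes([5, 2, 9, 1, 7, 3, 8])
```

Raising K cuts the number of passes (roughly log_K of the run count) but increases per-pass cost, which is the K-scaling trade-off the disclosure's single feed-forward data path is designed to sidestep.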
Approaches presented herein provide for the generation of relatively small language models that are optimized for target languages. A multilingual large language model (LLM) can be reduced in size using a process such as language-aware pruning, where individual network parameters have importance scores calculated with respect to the target language and then an appropriate number of lower-importance score parameters are removed from the network. Continued pretraining can be performed using a set of training data including real and/or synthesized text in the target language, to obtain a high performing language model with a limited number of parameters optimized for a target language, as may correspond to a lower-resource language that may otherwise not have enough training data available to sufficiently train a language model from scratch.
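The language-aware pruning step described above, scoring parameters by importance with respect to the target language and removing the lowest-scoring ones, can be sketched over a flat parameter list. The scoring itself is assumed to happen elsewhere; names and the zeroing convention are illustrative.

```python
def prune_by_importance(params, scores, keep_ratio):
    """Keep only the highest-importance parameters; pruned entries are zeroed.
    scores[i] is an assumed importance of params[i] w.r.t. the target language."""
    n_keep = int(len(params) * keep_ratio)
    keep = set(sorted(range(len(params)),
                      key=lambda i: scores[i], reverse=True)[:n_keep])
    return [p if i in keep else 0.0 for i, p in enumerate(params)]

params = [0.5, -1.2, 0.1, 2.0]
scores = [0.9, 0.2, 0.1, 0.8]   # hypothetical language-aware importance scores
pruned = prune_by_importance(params, scores, keep_ratio=0.5)
```

In practice pruning would operate on structured groups (heads, channels, layers) and be followed by the continued pretraining the abstract describes, so the surviving parameters recover quality in the target language.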
A first transformer includes a first primary winding and a first secondary winding. The first primary winding is coupled to a first phase branch to produce a first waveform. The first secondary winding is serially coupled to a compensation branch. A second transformer includes a second primary winding and a second secondary winding. The second primary winding is coupled to a second phase branch to produce a second waveform. The second secondary winding is serially coupled to the first secondary winding. An output node is coupled to the first phase branch and the second phase branch to provide an output waveform. The output waveform includes the first waveform and the second waveform, and has a transient response based on a first harmonic factor of the first phase branch and a second harmonic factor of the second phase branch.
H02M 3/158 - Conversion of DC power input into DC power output without intermediate conversion into AC by static converters using discharge tubes with control electrode or semiconductor devices with control electrode using devices of a triode or transistor type requiring continuous application of a control signal using semiconductor devices only with automatic control of output voltage or current, e.g. switching regulators including plural semiconductor devices as final control devices for a single load
H02M 1/088 - Circuits specially adapted for the generation of control voltages for semiconductor devices incorporated in static converters for the simultaneous control of series or parallel connected semiconductor devices
H02M 1/14 - Arrangements for reducing ripples from DC input or output
98.
TRANSIENT VOLTAGE LIMITING REGULATOR (TLVR) WITH VARIABLE INDUCTOR
A first primary winding of a first transformer is coupled to a first phase branch to produce a first waveform. A second primary winding of a second transformer is coupled to a second phase branch to produce a second waveform. A first secondary winding of the first transformer and a second secondary winding of the second transformer are serially coupled to a compensation branch. The compensation branch includes a variable inductor having an inductance that corresponds to a threshold current. An output node is coupled to the first phase branch and the second phase branch to provide an output waveform. The output waveform includes the first waveform and the second waveform, and has a transient response based on a first harmonic factor of the first phase branch and a second harmonic factor of the second phase branch.
H02M 3/158 - Conversion of DC power input into DC power output without intermediate conversion into AC by static converters using discharge tubes with control electrode or semiconductor devices with control electrode using devices of a triode or transistor type requiring continuous application of a control signal using semiconductor devices only with automatic control of output voltage or current, e.g. switching regulators including plural semiconductor devices as final control devices for a single load
H02M 1/088 - Circuits specially adapted for the generation of control voltages for semiconductor devices incorporated in static converters for the simultaneous control of series or parallel connected semiconductor devices
H02M 1/14 - Arrangements for reducing ripples from DC input or output
99.
EXTRACTING CONNECTED OBJECT COMPONENTS FOR DEDUPLICATION IN THREE DIMENSIONAL CONTENT CREATION AND SCENE RENDERING
Approaches presented herein provide systems and methods to identify connected components within an object using an assigned label. The assigned label may be propagated through a number of connected components to identify a number of components having the same assigned label. Index values for the assigned labels may also be identified within a memory buffer to determine a range within the buffer for the connected component and to group and store each of the individual components within a contiguous sequence.
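The label-propagation idea in entry 99 can be sketched on a small adjacency list: every node starts with its own label, and labels are repeatedly lowered to the minimum among neighbors until components stabilize on a shared label. This iterative minimum-propagation scheme is an illustrative stand-in, not the disclosed GPU implementation.

```python
def propagate_labels(adjacency):
    """Connected-component labeling by propagation: each node starts with its
    own index as a label, then repeatedly takes the minimum label among itself
    and its neighbours until no label changes."""
    labels = list(range(len(adjacency)))
    changed = True
    while changed:
        changed = False
        for u, nbrs in enumerate(adjacency):
            for v in nbrs:
                m = min(labels[u], labels[v])
                if labels[u] != m or labels[v] != m:
                    labels[u] = labels[v] = m
                    changed = True
    return labels

# two components: {0, 1, 2} and {3, 4}
adj = [[1], [0, 2], [1], [4], [3]]
labels = propagate_labels(adj)
```

Nodes sharing a final label belong to one connected component; sorting elements by label then yields the contiguous per-component ranges the abstract describes storing in a buffer.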
Approaches presented herein provide systems and methods to reuse a rendered image for noising and denoising steps used for training one or more content generation systems. The reused rendered image may reduce computationally expensive processes, such as content generation and rendering, and enable multiple gradients to be compared using a common image that may be noised and then processed by one or more diffusion models to compute a gradient. The gradients may be combined and used to retrain the model, providing more training data with less variance between generating and rendering steps.