An integrated circuit (IC) includes clock modulation circuitry that includes delay hierarchy circuitry coupled to a register, the delay hierarchy circuitry configured to receive a clock (CLK) signal, provide a delayed master clock (CLKM) signal to a master latch of the register, and provide a delayed slave clock (CLKS) signal to a slave latch of the register.
Embodiments herein describe a circuit including a user domain configured to execute user functions and a hardened domain configured to communicate with the user domain. The hardened domain includes peripheral component interconnect express (PCIe) function decoding logic having a plurality of register bits and a Trusted Execution Environment (TEE) Device Interface Security Protocol (TDISP) core communicating with the PCIe function decoding logic. The TDISP core supports a plurality of PCIe functions. Each register bit of the plurality of register bits is assigned to a respective PCIe function of the plurality of PCIe functions.
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
Embodiments herein describe a circuit including a passive intermodulation (PIM) model circuit configured to process first data to generate a PIM interference model output to be concatenated with second data, the second data including a first carrier frequency and a second carrier frequency, and the circuit further including a PIM model adapt circuit configured to receive frequency shifted captured data and frequency shifted PIM models to generate updated values to compensate for PIM interference after the PIM interference model output is concatenated with the second data.
H04B 1/525 - Hybrid arrangements, i.e. arrangements for transition from single-path two-direction transmission to single-direction transmission on each of two paths or vice versa, with means for reducing leakage of the transmitter signal into the receiver
H04B 1/00 - Details of transmission systems not covered by one of the groups; Details of transmission systems not characterised by the medium used for transmission
Embodiments herein describe a computer architecture including at least one core including a first cache and a second cache, a shared cache, and an accelerator comprising circuitry configured to manage data and instructions transferred between the first and second caches and the shared cache, wherein the accelerator is configured to allow an implementation of a user task to perform multi-level prefetching to obtain address translation mappings in time. Address translation mappings are mappings between virtual addresses and physical addresses stored in a page table. The multi-level prefetching includes a first prefetching request (far request), a second prefetching request (near request), and a third prefetching request (now request).
G06F 12/0862 - Addressing of a memory level in which access to the desired data or data blocks requires associative addressing means, e.g. caches, with prefetch
G06F 12/084 - Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
G06F 12/0873 - Mapping of cache memory to storage devices or parts of storage devices
G06F 12/1009 - Address translation using page tables, e.g. page table structures
5.
POWER REDUCTION IN AN ARRAY OF DATA PROCESSING ENGINES
Embodiments herein describe a hardware accelerator that includes multiple power or clock domains. For example, the hardware accelerator can include an array of data processing engines (DPEs) where different subsets of the DPEs (e.g., different columns, rows, or blocks) are disposed in different power or clock domains within the hardware accelerator. When one or more subsets of the DPEs are idle (e.g., the hardware accelerator has not assigned any tasks to those DPEs), the accelerator can deactivate the corresponding power or clock domain (or domains), which deactivates the DPEs in those domains while the DPEs in the other power or clock domains remain operational. As such, idle DPEs can be deactivated to conserve energy while DPEs with work can remain operational.
G06F 1/3237 - Power saving characterised by the action undertaken by disabling clock generation or distribution
G06F 1/04 - Generating or distributing clock signals or signals derived directly therefrom
G06F 1/3228 - Monitoring task completion, e.g. by use of idle timers, stop commands or wait commands
G06F 1/3234 - Power saving characterised by the action undertaken
G06F 1/3287 - Power saving characterised by the action undertaken by switching off individual functional units in a computer
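
The idle-domain gating described in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `ClockDomain`, `Accelerator`, `assign`, and `update_domains` are hypothetical, each domain is assumed to cover one column of DPEs, and "idle" is assumed to mean "no queued tasks". The actual hardware mechanism is not specified at this level of detail.

```python
from dataclasses import dataclass, field


@dataclass
class ClockDomain:
    """One power or clock domain covering a subset (here, a column) of DPEs."""
    dpe_ids: list
    active: bool = True


@dataclass
class Accelerator:
    """Hypothetical model: tracks queued work per DPE and gates idle domains."""
    domains: dict = field(default_factory=dict)  # domain name -> ClockDomain
    pending: dict = field(default_factory=dict)  # DPE id -> number of queued tasks

    def assign(self, dpe_id, n_tasks=1):
        """Queue work on a single DPE."""
        self.pending[dpe_id] = self.pending.get(dpe_id, 0) + n_tasks

    def update_domains(self):
        """Deactivate every domain whose DPEs are all idle; keep the rest on."""
        for domain in self.domains.values():
            domain.active = any(self.pending.get(d, 0) > 0 for d in domain.dpe_ids)
        return {name: d.active for name, d in self.domains.items()}


acc = Accelerator(domains={
    "col0": ClockDomain(dpe_ids=[0, 1]),
    "col1": ClockDomain(dpe_ids=[2, 3]),
})
acc.assign(2)                  # only a DPE in col1 has work
state = acc.update_domains()   # col0 is gated off, col1 stays operational
```

Gating at domain granularity rather than per DPE mirrors the abstract: a domain stays on as long as any one of its DPEs has work.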
6.
CONTROLLER FOR AN ARRAY OF DATA PROCESSING ENGINES
Embodiments herein describe integrating an accelerator into a same SoC (or same chip or IC) as a CPU. The SoC also includes a controller (e.g., a microcontroller) that orchestrates data processing engines (DPEs) in the accelerator. The controller (or orchestrator) receives a task from the CPU and then configures the DPEs to perform the task. For example, the controller may divide the task into a sequence of operations that are performed by one or more of the DPEs. The controller can then report back to the CPU when the task is complete.
G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
G06F 12/1027 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
Embodiments herein describe integrating an AI accelerator into a same SoC (or same chip or IC) as a CPU. Thus, instead of relying on off-chip communication techniques, on-chip communication techniques such as an interconnect (e.g., a NoC) can be used to facilitate communication. This can result in faster communication between the AI accelerator and the CPU. Moreover, a tighter integration between the CPU and AI accelerator can make it easier for the CPU to offload AI tasks to the AI accelerator. In one embodiment, the AI accelerator includes address translation circuitry for translating virtual addresses used in the AI accelerator to physical addresses used to store the data.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 12/1036 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
8.
AN AREA AND POWER EFFICIENT CLOCK DATA RECOVERY (CDR) AND ADAPTATION IMPLEMENTATION FOR DENSE WAVELENGTH-DIVISION MULTIPLEXING (DWDM) OPTICAL LINKS
Embodiments herein describe techniques for area and power efficient clock data recovery (CDR) and adaptation implementations for dense wavelength-division multiplexing (DWDM) optical links and other types of links. One example is a system that includes a plurality of receiver circuits that sample signals based on respective receiver clocks, where the receiver circuits include a reference receiver circuit and remaining receiver circuits, and where the receiver clock of the reference receiver circuit comprises a reference clock. The system further includes a clock and data recovery (CDR) circuit that controls a phase of the reference clock based on outputs of the reference receiver circuit, and time-multiplexed de-skew circuitry configured to determine time-multiplexed phase offsets for the remaining receiver circuits based on time-multiplexed outputs of the remaining receiver circuits, where the remaining receiver circuits phase-shift the reference clock based on the respective time-multiplexed phase offsets to provide the respective receiver clocks.
H04J 14/02 - Wavelength-division multiplex systems
G02B 6/12 - Light guides; Structural details of arrangements comprising light guides and other optical elements, e.g. couplings, of the optical waveguide type of the integrated circuit kind
9.
CONTROLLER FOR AN ARRAY OF DATA PROCESSING ENGINES
Embodiments herein describe integrating an accelerator into a same SoC (or same chip or IC) as a CPU. The SoC also includes a controller (e.g., a microcontroller) that orchestrates data processing engines (DPEs) in the accelerator. The controller (or orchestrator) receives a task from the CPU and then configures the DPEs to perform the task. For example, the controller may divide the task into a sequence of operations that are performed by one or more of the DPEs. The controller can then report back to the CPU when the task is complete.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 13/28 - Handling requests for interconnection or transfer for access to an input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 12/1036 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
G06F 12/0831 - Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Embodiments herein describe a hardware accelerator that includes multiple power or clock domains. For example, the hardware accelerator can include data processing engines (DPEs) which include circuitry for performing acceleration tasks (e.g., artificial intelligence (AI) tasks, data encryption tasks, data compression tasks, and the like). The DPEs are interconnected to permit them to share data when performing the acceleration tasks. In addition to the DPEs, the hardware accelerator can include other circuitry such as an interconnect, a controller, address translation circuitry, etc. The DPEs may be in a first power or clock domain while the other circuitry is in a second power or clock domain. That way, when the DPEs are idle (e.g., the hardware accelerator currently has no tasks assigned to it), the first power or clock domain can be powered down while the second power or clock domain can remain powered.
Methods and systems for generating missing reference pixels for intra prediction of coding units are described. A pattern amongst a plurality of available reference pixel samples from a set of reference pixel samples is computed. The pattern can be determined based on a computed difference between actual pixel values of the available reference pixel samples. The patterns are learned based on a comparison of the computed difference between the actual pixel values to a predetermined threshold. The unavailable pixel values are then generated based on the learned pattern. Further, one or more image effects corresponding to the available reference pixel samples are automatically replicated in the generated pixels as well.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/159 - Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a pixel
H04N 19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
A transceiver circuit is disclosed, the transceiver circuit including a first register circuit, configured to receive serial stimulus data and to generate multi-bit parallel stimulus data, a serializer circuit configured to receive the multi-bit parallel stimulus data and to generate serialized data based on the multi-bit parallel stimulus data, where the serializer circuit includes a serializer data storage device, and where the serializer data storage device lacks circuit structures for scanability, a deserializer circuit configured to receive serial receiver data corresponding with the serialized data and to generate multi-bit parallel response data based on the serial receiver data, where the deserializer circuit includes a deserializer data storage device, and where the deserializer data storage device lacks circuit structures for scanability, and a second register circuit, configured to receive the multi-bit parallel response data and to generate serial response data.
Embodiments herein describe techniques for area and power efficient clock data recovery (CDR) and adaptation implementations for dense wavelength-division multiplexing (DWDM) optical links and other types of links. One example is a system that includes a plurality of receiver circuits that sample signals based on respective receiver clocks, where the receiver circuits include a reference receiver circuit and remaining receiver circuits, and where the receiver clock of the reference receiver circuit comprises a reference clock. The system further includes a clock and data recovery (CDR) circuit that controls a phase of the reference clock based on outputs of the reference receiver circuit, and time-multiplexed de-skew circuitry configured to determine time-multiplexed phase offsets for the remaining receiver circuits based on time-multiplexed outputs of the remaining receiver circuits, where the remaining receiver circuits phase-shift the reference clock based on the respective time-multiplexed phase offsets to provide the respective receiver clocks.
H04L 7/00 - Arrangements for synchronising the receiver with the transmitter
H04L 1/00 - Arrangements for detecting or preventing errors in the information received
H04L 7/02 - Speed or phase control by the received code signals, the signals containing no special synchronisation information
Described herein are systems and methods for scalable communications. A circuit can receive a request from an application to communicate with a destination over a network. The circuit can identify the destination from information included in the request. In a first case that resources have been allocated for communicating with the destination identified from the request, the circuit can communicate data to the destination over the network using the resources that have been allocated. In a second case that resources have not been allocated for communicating with the destination identified from the request, the circuit can allocate resources for communicating the data with the destination. The circuit can communicate the data to the destination over the network using the resources that have been allocated.
H04L 47/76 - Admission control; Resource allocation using dynamic resource allocation, e.g. in-call renegotiation requested by the user or requested by the network in response to changing network conditions
H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
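
The two cases in the abstract above (resources already allocated for a destination versus not yet allocated) amount to allocate-on-first-use. The following is a minimal sketch of that control flow only. The class `ScalableEndpoint`, the request fields `destination` and `data`, and the `queue_id` resource handle are all hypothetical names chosen for illustration; the source does not say what the resources are or how they are represented.

```python
class ScalableEndpoint:
    """Hypothetical model of a circuit that allocates per-destination resources lazily."""

    def __init__(self):
        self.resources = {}   # destination -> allocated resource handle
        self.allocations = 0  # how many times allocation was actually performed
        self.sent = []        # (queue_id, data) pairs, standing in for network sends

    def _allocate(self, destination):
        """Second case: no resources exist for this destination yet."""
        self.allocations += 1
        return {"dest": destination, "queue_id": self.allocations}

    def send(self, request):
        # Identify the destination from information included in the request.
        destination = request["destination"]
        resource = self.resources.get(destination)
        if resource is None:
            resource = self._allocate(destination)
            self.resources[destination] = resource
        # In both cases, communicate using the allocated resources.
        self.sent.append((resource["queue_id"], request["data"]))
        return resource["queue_id"]


ep = ScalableEndpoint()
first = ep.send({"destination": "10.0.0.2", "data": b"hello"})   # allocates
second = ep.send({"destination": "10.0.0.2", "data": b"again"})  # reuses
third = ep.send({"destination": "10.0.0.3", "data": b"other"})   # allocates
```

Allocating only for destinations that are actually contacted keeps resource usage proportional to the active destinations rather than to every possible peer.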
Embodiments herein describe integrating an AI accelerator into a same SoC (or same chip or IC) as a CPU. Thus, instead of relying on off-chip communication techniques, on-chip communication techniques such as an interconnect (e.g., a NoC) can be used to facilitate communication. This can result in faster communication between the AI accelerator and the CPU. Moreover, a tighter integration between the CPU and AI accelerator can make it easier for the CPU to offload AI tasks to the AI accelerator. In one embodiment, the AI accelerator includes address translation circuitry for translating virtual addresses used in the AI accelerator to physical addresses used to store the data.
G06F 15/78 - Architectures of general purpose stored program computers comprising a single central processing unit
G06F 12/1027 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
G06F 12/1081 - Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
G06F 13/28 - Handling requests for interconnection or transfer for access to an input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
16.
SYSTEMS AND METHODS FOR DECENTRALIZED ADDRESS TRANSLATION
The disclosed computer-implemented method for decentralized address translation can include receiving, by at least one processor implemented outside a processor core, a virtual address translation request. The method can additionally include retrieving, by the at least one processor and in response to the virtual address translation request, a physical address. The method can also include returning, by the at least one processor, the physical address. Various other methods, systems, and computer-readable media are also disclosed.
G06F 12/1045 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
G06F 12/0897 - Caches characterised by their organisation or structure with two or more cache hierarchy levels
17.
MITIGATION OF CONTROL SET PACKING RESTRICTIONS FOR INTEGRATED CIRCUITS
Mitigation of control set packing restrictions includes generating an Observability Don't Care (ODC) expression for a target register of a circuit design. The target register has an original reset signal that is a constant. A plurality of supports of the ODC expression that are driven by driver registers are grouped into a plurality of groups. Each group of the plurality of groups includes only supports that are driven by driver registers having a same reset signal. A control set of each group is different from a control set of the target register. The reset signal of a selected group of the plurality of groups is designated as a candidate reset signal for the target register based on an evaluation of the ODC expression. The circuit design is modified by connecting the candidate reset signal to the target register in place of the original reset signal.
Disclosed approaches for rendering event data from subsystems in different clock domains according to a system-level timeline include, for each of multiple subsystems, sampling a system timer in a first clock domain for a first timestamp by a host processor. The host processor requests a subsystem timestamp from a subsystem timer in each of the subsystems. The subsystem timestamp is associated with the first timestamp, and the subsystem timer operates in a clock domain different from the first clock domain. The host processor translates timestamps in traced event data of the subsystems to a timeline of the system timer using the first timestamp and associated subsystem timestamps.
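
The translation step described above can be sketched as simple arithmetic on a paired (system timestamp, subsystem timestamp) sample. This is an illustrative sketch, not the disclosed method: it assumes each subsystem timer runs at a known, constant tick rate, and the function names `to_system_time` and `merge_timelines`, the subsystem names, and the tick rates are all hypothetical.

```python
def to_system_time(event_ts, sync, subsystem_hz, system_hz):
    """Map a subsystem timestamp onto the system timer's timeline.

    sync is the pair (system_ts, subsystem_ts) sampled together by the host.
    Assumption: both timers tick at constant, known rates.
    """
    system_ts, subsystem_ts = sync
    return system_ts + (event_ts - subsystem_ts) * system_hz // subsystem_hz


def merge_timelines(traces, syncs, rates, system_hz):
    """Translate every traced event and order all events on one timeline."""
    merged = []
    for name, events in traces.items():
        for ts, label in events:
            t = to_system_time(ts, syncs[name], rates[name], system_hz)
            merged.append((t, name, label))
    return sorted(merged)


system_hz = 1_000_000_000                                  # system timer: 1 tick per ns
syncs = {"dsp": (5000, 1000), "noc": (5200, 40)}           # (system_ts, subsystem_ts)
rates = {"dsp": 250_000_000, "noc": 100_000_000}           # subsystem tick rates
traces = {
    "dsp": [(1400, "kernel_start")],
    "noc": [(100, "packet_sent")],
}
timeline = merge_timelines(traces, syncs, rates, system_hz)
```

In this example the `dsp` event lands 400 subsystem ticks (1600 system ticks) after its sync point, and the `noc` event lands 60 subsystem ticks (600 system ticks) after its own, so the `noc` event is ordered first on the shared timeline.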
A network-on-chip (NoC) includes a switch. The switch includes a first sub-switch, a second sub-switch, and a synchronization channel coupled to the first sub-switch and the second sub-switch. The first sub-switch and the second sub-switch are coupled to corresponding sub-switches in at least one other switch included in the NoC. Each of the first sub-switch and the second sub-switch includes ports in north, south, east, and west directions. The first sub-switch and the second sub-switch exchange flits of data through an additional port of the first sub-switch coupled to an additional port of the second sub-switch.
H04L 49/109 - Packet switching elements characterised by the switching fabric construction, integrated on microchip, e.g. switch-on-chip
20.
SYSTEMS AND METHODS FOR PARALLELIZATION OF EMBEDDING OPERATIONS
A disclosed method may include initializing a deep learning recommendation model (DLRM) comprising a plurality of embedding tables, each embedding table comprising a plurality of embeddings. The method may also include receiving input data associated with accessing embeddings from the plurality of embedding tables and applying a parallelization strategy to process the plurality of embedding tables, the parallelization strategy configured to improve performance by distributing computational workloads and optimizing memory access. The method may also include processing the embeddings based on the input data in accordance with the parallelization strategy, the processing comprising aggregating embeddings accessed from the plurality of embedding tables. The method may also include generating, for further processing, output data based on the processed embeddings. Various other methods, systems, and computer-readable media are also disclosed.
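
One possible reading of the parallelization strategy above is table-wise parallelism: each embedding table is looked up and pooled independently, and the pooled vectors are then combined. The sketch below assumes that reading, uses sum pooling as the aggregation, and uses a thread pool purely as a stand-in for parallel hardware. The names `pooled_lookup` and `process_tables` are hypothetical and the source does not name a specific strategy.

```python
from concurrent.futures import ThreadPoolExecutor


def pooled_lookup(table, indices):
    """Fetch the requested embeddings from one table and sum them element-wise."""
    dim = len(table[0])
    out = [0.0] * dim
    for i in indices:
        row = table[i]
        for k in range(dim):
            out[k] += row[k]
    return out


def process_tables(tables, indices_per_table, workers=2):
    """Process each table in parallel, then concatenate the pooled vectors."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pooled = list(pool.map(pooled_lookup, tables, indices_per_table))
    return [x for vec in pooled for x in vec]


tables = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # table 0: three 2-dimensional embeddings
    [[0.5, 0.5], [1.5, 1.5]],              # table 1: two 2-dimensional embeddings
]
# Input data: rows 0 and 2 of table 0, row 1 of table 1.
output = process_tables(tables, [[0, 2], [1]])
```

Because `pool.map` preserves input order, the concatenated output keeps a stable layout regardless of which table finishes first.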
A method includes receiving, by a compiler of a host of a computing system, input code, generating, by the compiler, pipelined input code by adding first tokens in a loop iteration argument field of a loop in the input code to pipeline the loop, the first tokens configured to sequentialize and serialize loop operations, a quantity of the first tokens based on a quantity of pipeline stages, and providing, by the host, the pipelined input code to a controller of an integrated circuit (IC) of the computing system.
Techniques for substrate noise isolation structures for electronic devices are provided. The disclosed techniques greatly reduce substrate noise induced by circuits in integrated circuits (ICs) that include Fin Field Effect Transistors (FinFETs). In an example, an electronic device is provided that includes a first circuit and a second circuit formed on a substrate, a first guard structure formed in the substrate, and a plurality of vias formed through the substrate. The first guard structure formed in the substrate is disposed between the first circuit and the second circuit. The plurality of vias formed through the substrate contact the first guard structure.
H01L 21/768 - Applying interconnections to be used for carrying current between separate components within a device
H01L 23/48 - Arrangements for conducting electric current to or from the solid state body in operation, e.g. leads or terminal arrangements
H01L 23/522 - Arrangements for conducting electric current within the device in operation from one component to another, including external interconnections consisting of a multilayer structure of conductive and insulating layers inseparably formed on the semiconductor body
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in the group
Disclosed herein are thermal management devices and electronic devices that utilize a plurality of plunger assemblies to route heat efficiently from chip packages. In some examples, the thermal management devices may also be used in electronic devices to route heat efficiently from a power delivery layer residing below chip packages. In one example, a thermal management device is provided that includes a plurality of plunger assemblies retained to a metal plate. Each plunger assembly includes a metal body extending normally through an aperture formed between first and second sides of the plate and a spring biasing a distal end of the metal body away from the second side of the plate.
Embodiments herein describe a method for selectively filtering different wavelengths of optical signals received from an optical channel using cascaded ring resonators, each of the cascaded ring resonators having a first ring and a second ring. The first ring has a varying waveguide width along its length configured to form a first waveguide width portion and a second waveguide width portion, the first waveguide width portion having a greater width than the second waveguide width portion. The second ring has a varying waveguide width along its length configured to form a third waveguide width portion and a fourth waveguide width portion, the fourth waveguide width portion having a greater width than the third waveguide width portion. The method further connects receivers to respective cascaded ring resonators, each of the receivers having a photodetector configured to differentiate between the optical signals.
G02B 6/293 - Optical coupling means having data bus means, i.e. plural waveguides interconnected and providing an inherently bidirectional system by mixing and splitting signals, with wavelength selective means
Disclosed herein are thermal management devices and electronic devices that utilize a plurality of plunger assemblies to route heat efficiently from chip packages. In some examples, the thermal management devices may also be used in electronic devices to route heat efficiently from a power delivery layer residing below chip packages. In one example, a thermal management device is provided that includes a plurality of plunger assemblies retained to a metal plate. Each plunger assembly includes a metal body extending normally through an aperture formed between first and second sides of the plate and a spring biasing a distal end of the metal body away from the second side of the plate.
H01L 23/46 - Arrangements for cooling, heating, ventilating or temperature compensation involving the transfer of heat by flowing fluids
H01L 23/433 - Auxiliary members characterised by their shape, e.g. pistons
H01L 23/52 - Arrangements for conducting electric current within the device in operation from one component to another
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in the group
A method for probing power contact pads on an integrated circuit (IC) die is disclosed. The method includes depositing a probing bump over multiple vias. The vias may be directly exposed or include an exposed contact pad. The method also includes forming a probing bump over and in electric contact with multiple vias. Optionally, the probing bump may be removed after probing.
Examples herein describe polynomial root search circuitry. The polynomial root search circuitry includes a search circuit configured to identify distinct roots of a first locator polynomial using parallel processing elements. A first subset of the parallel processing elements is configured to output terms of a second locator polynomial based on a first candidate root of the second locator polynomial. A second subset of the parallel processing elements is configured to output the terms of the second locator polynomial based on a second candidate root of the second locator polynomial.
G06F 7/556 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using unspecified devices for evaluating functions by calculation of logarithmic or exponential functions
G06F 7/552 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation, using unspecified devices for evaluating functions by calculation of powers or roots
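
The idea of testing several candidate roots at once can be illustrated in software with an exhaustive root search over a small finite field. The sketch below is a simplification under stated assumptions: it handles a single polynomial, works in GF(2^4) with the primitive polynomial x^4 + x + 1 (the abstract does not name a field), and models each group of parallel processing elements as one lane of a batch. The names `gf_mul`, `poly_eval`, and `root_search` are hypothetical.

```python
def gf_mul(a, b, poly=0x13, m=4):
    """Multiply two elements of GF(2^m); poly=0x13 encodes x^4 + x + 1 (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):
            a ^= poly
    return result


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest-degree coefficient first) by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = gf_mul(acc, x) ^ c
    return acc


def root_search(coeffs, lanes=2, m=4):
    """Test every nonzero field element, `lanes` candidates per step."""
    roots = []
    candidates = list(range(1, 1 << m))
    for start in range(0, len(candidates), lanes):
        batch = candidates[start:start + lanes]
        # Each lane stands in for one subset of parallel processing elements,
        # evaluating the polynomial at its own candidate root.
        results = [poly_eval(coeffs, x) for x in batch]
        roots.extend(x for x, r in zip(batch, results) if r == 0)
    return roots


# Build (x + 2)(x + 3) over GF(2^4): x^2 + (2 XOR 3)x + 2*3.
r1, r2 = 2, 3
locator = [1, r1 ^ r2, gf_mul(r1, r2)]
roots = root_search(locator)
```

Raising `lanes` reduces the number of sequential steps needed to cover all candidates, which is the trade-off parallel processing elements offer in hardware.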
28.
DETERMINING QUANTIZATION SCALE FACTORS FOR LAYERS OF A MACHINE LEARNING MODEL
Approaches for determining quantization scale factors include generating a population of chromosomes. Each chromosome has multiple genes, and each gene specifies a scale factor associated with a layer of a machine learning model. The population of chromosomes is evaluated, and the evaluating includes, for each chromosome in the population, quantizing floating point weights and floating point values of a representative dataset using the scale factors of the chromosome to produce quantized weights and a quantized dataset in a memory arrangement, initiating processing of the quantized dataset using the quantized weights according to the machine learning model, and gauging a level of accuracy of results produced by the processing of the quantized dataset. Satisfaction of termination criteria is determined based on the levels of accuracy associated with the chromosomes in the population. The population of chromosomes is evolved and the evaluating repeated in response to the termination criteria not being satisfied.
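
The evaluate-then-evolve loop described above follows the usual shape of a genetic algorithm. The sketch below is a simplified illustration, not the disclosed procedure: instead of running a model on a quantized dataset and gauging accuracy, it uses total squared quantization error of the weights as a stand-in fitness, restricts scale factors to powers of two, and uses elitism with one-point crossover and random mutation. All function names and parameters (`quantize`, `error`, `evolve`, `tol`, `seed`) are hypothetical.

```python
import random


def quantize(values, scale, bits=8):
    """Quantize floats to signed integers with the given scale, then dequantize."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [max(lo, min(hi, round(v / scale))) * scale for v in values]


def error(chromosome, layers):
    """Stand-in fitness (assumption): squared quantization error over all layers."""
    return sum(
        (w - q) ** 2
        for scale, weights in zip(chromosome, layers)
        for w, q in zip(weights, quantize(weights, scale))
    )


def evolve(layers, pop_size=8, generations=20, tol=1e-4, seed=0):
    """Each chromosome holds one gene (scale factor) per layer."""
    rng = random.Random(seed)
    choices = [2.0 ** -k for k in range(10)]  # candidate scale factors
    population = [[rng.choice(choices) for _ in layers] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda c: error(c, layers))     # evaluate
        history.append(error(population[0], layers))
        if history[-1] <= tol:                              # termination criterion
            break
        parents = population[: pop_size // 2]               # keep the fittest half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(layers)) if len(layers) > 1 else 0
            child = a[:cut] + b[cut:]                       # one-point crossover
            if rng.random() < 0.3:                          # occasional mutation
                child[rng.randrange(len(child))] = rng.choice(choices)
            children.append(child)
        population = parents + children
    best = min(population, key=lambda c: error(c, layers))
    return best, history


layers = [[0.5, -1.25, 3.0], [0.01, 0.02, -0.03]]  # toy per-layer weights
best, history = evolve(layers)
```

Because the fittest half of each generation is carried over unchanged, the best error recorded in `history` never increases from one generation to the next.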
Embodiments herein describe a method for selectively filtering different wavelengths of optical signals received from an optical channel using cascaded ring resonators, each of the cascaded ring resonators having a first ring and a second ring. The first ring has a varying waveguide width along its length configured to form a first waveguide width portion and a second waveguide width portion, the first waveguide width portion having a greater width than the second waveguide width portion. The second ring has a varying waveguide width along its length configured to form a third waveguide width portion and a fourth waveguide width portion, the fourth waveguide width portion having a greater width than the third waveguide width portion. The method further connects receivers to respective cascaded ring resonators, each of the receivers having a photodetector configured to differentiate between the optical signals.
G02B 6/293 - Optical coupling means having data buses, i.e. several interconnected waveguides providing an inherently bidirectional system by mixing and splitting signals, with wavelength-selective means
30.
MODULAR INTERCONNECT FOR AN INTEGRATED CIRCUIT DEVICE
An integrated circuit device includes a network-on-chip (NoC). Connections for the NoC are generated from a circuit design for the corresponding integrated circuit device. Connections within the NoC are generated by analyzing the circuit design to detect a first connection attribute. The first connection attribute defines a first NoC master unit (NMU) and a first NoC slave unit (NSU). Further, a first NoC configuration is generated. The first NoC configuration includes the connections determined based on the first NMU and the first NSU.
Embodiments herein describe a multiple die system that includes an interposer that connects a first die to a second die. Each die has a bump interface structure that is connected to the other structure using traces in the interposer. However, the bump interface structures may have different orientations relative to each other, or one of the interface structures defines fewer signals than the other. Directly connecting the corresponding signals defined by the structures to each other may be impossible to do in the interposer, or make the interposer too costly. Instead, the embodiments here simplify routing in the interposer by connecting the signals in the bump interface structures in a way that simplifies the routing but jumbles the signals. The jumbled signals can then be corrected using reordering circuitry in the dies (e.g., in the link layer and physical layer).
H01L 23/00 - Details of semiconductor or other solid-state devices
G11C 5/06 - Arrangements for electrically interconnecting storage elements
H01L 23/538 - Arrangements for conducting electric current within the device in operation, from one component to another, the interconnection structure between a plurality of semiconductor chips being located on, or in, insulating substrates
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
32.
Dynamic adjustment of floating point exponent bias for exponent compression
Approaches for compressing exponents of floating point values include accumulating a distribution of values of exponents of a first set of floating point values, and compressing the exponents of the first set of floating point values into a compressed exponent bit-width as a function of a compressed exponent bias. The compressed exponent bit-width and the compressed exponent bias are adjusted based on the distribution of values of exponents of the first set of floating point values. The distribution of values of exponents of the first set of floating point values is accumulated with values of exponents of a second set of floating point values that is input in a subsequent time period. The exponents of the second set of floating point values are compressed into the compressed exponent bit-width as a function of the compressed exponent bias after the adjusting of the compressed exponent bit-width and the compressed exponent bias.
G06F 7/483 - Computations with numbers represented by a non-linear combination of coded numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
G06F 7/499 - Value or exception handling, e.g. rounding or overflow
H03M 7/24 - Conversion to or from floating-point codes
H03M 7/30 - Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
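The accumulate, compress, and adjust sequence in the exponent-compression abstract above can be sketched as follows. This is an illustrative model only: the class name `ExponentCompressor`, the use of `math.frexp` exponents, the rule "bias = smallest observed exponent, width = bits needed to span the observed range", and the saturating clamp are all assumptions, and zero inputs are simply left out of the histogram.

```python
import math
from collections import Counter

class ExponentCompressor:
    """Toy model of exponent compression with an adjustable bias and bit-width."""

    def __init__(self, width=4, bias=0):
        self.width = width          # compressed exponent bit-width
        self.bias = bias            # compressed exponent bias
        self.histogram = Counter()  # running distribution of observed exponents

    def accumulate(self, values):
        """Add the exponents of nonzero values to the running distribution."""
        for v in values:
            if v != 0.0:
                self.histogram[math.frexp(v)[1]] += 1

    def adjust(self):
        """Re-derive bias and width from the distribution (assumed policy)."""
        if not self.histogram:
            return
        lo, hi = min(self.histogram), max(self.histogram)
        self.bias = lo
        self.width = max(1, (hi - lo).bit_length())

    def compress(self, values):
        """Map each exponent to (exponent - bias), saturated to the bit-width."""
        top = (1 << self.width) - 1
        return [min(top, max(0, math.frexp(v)[1] - self.bias)) for v in values]

comp = ExponentCompressor()
first = [1.0, 2.0, 8.0]
comp.accumulate(first)
first_codes = comp.compress(first)    # uses the initial width and bias
comp.adjust()                         # width and bias now follow the first set
second = [4.0, 16.0]
comp.accumulate(second)               # distribution keeps growing over time
second_codes = comp.compress(second)  # uses the adjusted width and bias
```

In this sketch an exponent outside the adjusted range saturates at the largest code; a real design would need its own policy for such outliers.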
33.
PROTECTION OF A CIRCUIT DESIGN WITHIN A DESIGN CONTAINER
A key block can be generated from a session key used by a computer-based design tool for a circuit design by encrypting the session key using computer hardware. The key block can be divided, by the computer hardware, into a plurality of sub-blocks. A plurality of enhanced sub-blocks can be generated by the computer hardware by encrypting each sub-block of the plurality of sub-blocks with a different key of a plurality of keys corresponding to a plurality of Intellectual Property (IP) cores of the circuit design. The plurality of enhanced sub-blocks can be stored in a memory.
G06F 21/72 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer, to assure secure computing or processing of information in cryptographic circuits
G06F 30/392 - Floor-planning or layout, e.g. partitioning or placement
G06F 115/08 - Intellectual property [IP] blocks or IP cores
Low-latency gigabit transceiver PHY-based signal switching for emulation, prototyping, and high performance computing (HPC) in a computing platform that includes multiple ICs, where a first one of the ICs includes functional circuitry, a receiver that receives a signal from a second one of the ICs, a transmitter that transmits outgoing data to a third one of the ICs, and a bypass circuit that provides an output of the receiver to one of the functional circuitry and the transmitter (e.g., based on a destination address). The bypass circuit may bypass the functional circuitry, and may further bypass a receive-side media access controller (MAC) and a transmit-side MAC. The IC may multiplex outgoing data to the transmitters. Selectable functions of PHY circuitry may be disabled in bypass mode. The ICs may include field-programmable gate arrays, which may be programmed to emulate respective partitions of a circuit design and/or to perform other functions.
Examples herein describe alignment detection circuitry. The alignment detection circuitry includes a buffer, a first set of correlators, and a second set of correlators. The buffer is configured to output a data stream of multiplexed groups of symbols from multiple data lanes. The first set of correlators is configured to search a candidate data lane of the data stream for bits matching bits of a reference alignment marker based on a first search method. The second set of correlators is configured to search the candidate data lane of the data stream for bits matching the bits of the reference alignment marker based on a second search method.
Computer-based co-simulation includes simulating a circuit design and a co-simulation model configured to model circuitry that operates in coordination with a hardware implementation of the circuit design. In response to a request for a data transfer received by the co-simulation model from the circuit design, a ready signal is provided from the co-simulation model to the circuit design after a first predetermined number of simulation clock cycles corresponding to an initiation interval of the circuitry modeled by the co-simulation model. In response to receiving state information for the data transfer, a response from the co-simulation model is provided to the circuit design after a second predetermined number of simulation clock cycles corresponding to a response time of the circuitry modeled by the co-simulation model.
Methods for fabricating an integrated circuit (IC) device, an IC die configured for probe testing, and an IC device are described herein. In one example, the method includes: forming a conductive cap above and in electrical contact with two or more of a plurality of pillars, each pillar coupled to a power contact pad of an IC die; removing the cap after testing; and depositing a hybrid bonding layer over the IC die device, the hybrid bonding layer having hybrid bond pads coupled to the plurality of power contact pads and the signal contact pads of the IC die.
H01L 21/66 - Testing or measuring during manufacture or treatment
H01L 23/00 - Details of semiconductor or other solid-state devices
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
38.
DYNAMIC DATA CONVERSION FOR NETWORK COMPUTER SYSTEMS
A computing node for a computing system includes a processor, conversion circuitry, and routing circuitry. The processor generates a data signal based on a function of an application executed by the computing system. The data signal has a first precision format and a first sparse representation. The conversion circuitry receives the data signal from the processor and generates a converted data signal by at least one of converting the first precision format to a second precision format and converting the first sparse representation to a second sparse representation. The routing circuitry transmits the converted data signal to switch circuitry of the computing system.
A computer-implemented method for task management can include managing performance of a task on a message by a plurality of circuits. In some aspects, the task can comprise a sequence of processings to be performed on the message and each circuit of the plurality of circuits performing a processing of the sequence of processings. In some aspects, the method can include routing, based on the sequence, a first information regarding the task to a first circuit of the plurality of circuits to perform a first processing of the sequence of processings on the message; receiving, from the first circuit, an output of the first processing; and routing, based on the sequence of processings identified for the task, a second information regarding the task to a second circuit of the plurality of circuits to perform a second processing that follows the first processing in the sequence of processings.
A memory includes a read circuit having a first primitive configured to output a first data item based on least significant bits (LSBs) of a read address and a multiplexer coupled to the first primitive. The multiplexer outputs a selected bit from the first data item as read data based on most significant bits (MSBs) of the read address. The memory includes a write circuit having a second primitive that outputs a second data item based on LSBs of a write address and a modifier circuit that generates a third data item by modifying a bit of the second data item to correspond to write data. The bit is at a location within the second data item selected based on MSBs of the write address. The modifier circuit writes the third data item to a location in the second primitive based on the LSBs of the write address.
G06F 30/327 - Logic synthesis; Behaviour synthesis, e.g. mapping logic, hardware description language [HDL] to netlist, high-level language to register transfer level [RTL] or netlist
G06F 30/323 - Translation or migration, e.g. logic to logic, hardware description language translation or netlist translation
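The address split in the memory abstract above (LSBs pick a multi-bit data item, MSBs pick one bit inside it) can be modeled in software. This is a behavioral sketch only: the class name `BitAddressableMemory` is hypothetical, and a single backing array stands in for both the read-side and write-side primitives, which the abstract describes separately.

```python
class BitAddressableMemory:
    """Behavioral model: one-bit reads and writes on top of wide data items."""

    def __init__(self, lsb_bits=4, msb_bits=2):
        self.lsb_bits = lsb_bits
        self.item_width = 1 << msb_bits            # bits per stored data item
        self.primitive = [0] * (1 << lsb_bits)     # one data item per LSB value

    def _split(self, address):
        """Return (item index from the LSBs, bit position from the MSBs)."""
        lsbs = address & ((1 << self.lsb_bits) - 1)
        msbs = address >> self.lsb_bits
        if address < 0 or msbs >= self.item_width:
            raise IndexError("address out of range")
        return lsbs, msbs

    def read(self, address):
        """Fetch the item selected by the LSBs, then multiplex out one bit."""
        lsbs, msbs = self._split(address)
        item = self.primitive[lsbs]
        return (item >> msbs) & 1

    def write(self, address, bit):
        """Read-modify-write: change one bit of the item and store it back."""
        lsbs, msbs = self._split(address)
        item = self.primitive[lsbs]
        modified = (item & ~(1 << msbs)) | ((bit & 1) << msbs)
        self.primitive[lsbs] = modified

mem = BitAddressableMemory()
mem.write(0b10_0011, 1)   # MSBs = 2 select bit 2 of the item at LSB index 3
```

With the default sizes, 64 one-bit locations are packed into 16 four-bit items, so neighbouring MSB values share one stored item.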
41.
APPARATUS AND METHOD OF PRINTING SOLDER ON PRINTED CIRCUIT BOARD FOR WARPAGE COMPENSATION
A method of attaching a chip package to a printed circuit board (“PCB”) is provided, along with an electronic device fabricated using the method. The method includes measuring a warpage parameter of the chip package and selecting a stencil configured to compensate for warpage corresponding to the measured warpage parameter. The stencil includes a plurality of apertures. The selected stencil is positioned above the PCB, and solder paste is applied on the PCB via the plurality of apertures of the stencil. Thereafter, the PCB is moved away from the stencil. The chip package is positioned on the solder paste on the PCB, thereby attaching the chip package to the PCB.
H05K 3/12 - Apparatus or processes for manufacturing printed circuits in which conductive material is applied to the insulating support in such a manner as to form the desired conductive pattern, using printing techniques to apply the conductive material
H05K 13/08 - Monitoring manufacture of assemblages
An implementation may include a method for performing a binary multiplication, including receiving a first operand at an input interface of a digital multiplier circuit in a computing system, receiving a second operand at the input interface of the digital multiplier circuit, generating, by the digital multiplier circuit, partial products by performing an AND operation with each of the N bits of the first operand and each of the bits of the second operand, generating first modified partial products by modifying, by the digital multiplier circuit, most significant bits of the partial products, generating second modified partial products by modifying, by the digital multiplier circuit, one of the first modified partial products, generating, by the digital multiplier circuit, a product by summing the second modified partial products, and outputting the product from an output interface of the digital multiplier circuit.
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled, for shifting, e.g. justifying, scaling, normalising
H03K 19/21 - EXCLUSIVE-OR circuits, i.e. giving an output signal if a signal exists at only one input; COINCIDENCE circuits, i.e. giving an output signal only if all input signals are identical
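As an illustration of AND-generated partial products whose most significant bits are modified before summation, the sketch below uses the well-known Baugh-Wooley scheme for two's-complement multiplication. The multiplier abstract above does not name this scheme, so treat it as a stand-in technique chosen for the example rather than the claimed method; the function name and the 4-bit default width are likewise assumptions.

```python
def baugh_wooley_multiply(a, b, n=4):
    """Multiply two n-bit two's-complement integers via modified partial products."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]

    total = 0
    for i in range(n):
        for j in range(n):
            partial = a_bits[i] & b_bits[j]        # plain AND partial product
            # Invert partial products that involve exactly one sign bit.
            if (i == n - 1) != (j == n - 1):
                partial ^= 1
            total += partial << (i + j)

    # Correction constants of the Baugh-Wooley scheme.
    total += (1 << n) + (1 << (2 * n - 1))

    total &= (1 << (2 * n)) - 1                    # keep 2n result bits
    if total >> (2 * n - 1):                       # reinterpret as signed
        total -= 1 << (2 * n)
    return total
```

The inversions and the two constants replace the sign-extension that a naive signed array multiplier would need, which is why every row can be summed with ordinary unsigned adders.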
43.
EFFICIENT METHOD FOR THE LATCH TIMING ANALYSIS OF ELECTRONIC DESIGNS
Performing timing analysis of a circuit design includes building a timing graph of the circuit design, and determining delays of devices and wires of the circuit design based on the timing graph. Further, clock and arrival propagations for the circuit design are performed based on the delays of the devices and wires, latch loops are identified in the circuit design, and latch analysis on latches of the latch loops is performed. The timing analysis further includes performing arrival propagation for circuit elements of the circuit design impacted by the latch analysis performed on the latches of the latch loops, performing latch analysis on latches of the circuit design external to the latch loops, and performing required time and slack calculations on the circuit design.
Embodiments herein describe a circuit for detecting a single event upset (SEU). The circuit includes a latch including an output node, a first parity node, and a second parity node and correction circuitry configured to correct a single event upset (SEU) at the output node using the first and second parity nodes.
An example integrated circuit (IC) system includes a package substrate having a programmable integrated circuit (IC) and a companion IC mounted thereon, the programmable IC including a programmable fabric and the companion IC including application circuitry. The IC system further includes a system-in-package (SiP) bridge including a first SiP IO circuit disposed in the programmable IC, a second SiP IO circuit disposed in the companion IC, and conductive interconnect on the package substrate electrically coupling the first SiP IO circuit and the second SiP IO circuit. The IC system further includes first aggregation and first dispersal circuits in the programmable IC coupled between the programmable fabric and the first SiP IO circuit. The IC system further includes second aggregation and second dispersal circuits in the companion IC coupled between the application circuitry and the second SiP IO circuit.
Embodiments herein describe a learnable transform block disposed before, or in between, the neural network layers to transform received data into a more computational-friendly domain while preserving discriminative features required for the neural network to generate accurate results. In one embodiment, during a training phase, an AI system learns parameters for the transform block that are then used during the inference phase to transform received data into the computational-friendly domain that has a reduced size input. The transformed data may require less compute resources or less memory usage to process by the underlying hardware device that hosts the neural network.
The disclosed device includes a processor and an interconnect connecting the processor to a memory. The interconnect includes an interconnect agent that can forward memory requests from the processor to the memory and receive requested data returned by the memory. The requested data can include information for a next memory request such that the interconnect agent can send, to the memory, a speculative memory request using information for the next memory request that was received in response to the memory request. Various other methods, systems, and computer-readable media are also disclosed.
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
G06F 12/1027 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
A multi-chiplet system includes a first chiplet comprising a first transceiver and a first chiplet-to-chiplet (C2C) interface module, and a second chiplet comprising programmable logic circuitry and a second C2C interface module. The first transceiver is configured to generate a clock, which is transmitted from the first C2C interface module to the second C2C interface module, through a clock transmission wire, for data transfer between the first chiplet and the second chiplet.
Described herein are systems and methods for managing error detection in a message. A circuit can identify, based on an error detection configuration of the at least one circuit, a first portion of the message to be checked for errors before a second portion of the message is available to the at least one circuit, the first portion being less than all of the message to be checked for one or more errors. A circuit can analyze a number of bits of the first portion of the message using the at least one circuit and based on the error detection configuration. A circuit can, based on analyzing the first portion, determine whether the message includes the one or more errors. Various other methods, systems, and computer-readable media are also disclosed.
The disclosed device includes a processor and an interconnect connecting the processor to a memory. The interconnect includes an interconnect agent that can forward memory requests from the processor to the memory and receive requested data returned by the memory. The requested data can include information for a next memory request such that the interconnect agent can send, to the memory, a speculative memory request using information for the next memory request that was received in response to the memory request. Various other methods, systems, and computer-readable media are also disclosed.
G06F 15/78 - Architectures of general-purpose stored-program computers comprising a single central processing unit
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
G06F 12/0862 - Addressing of a memory level in which access to the desired data or data blocks requires associative addressing means, e.g. caches with prefetch
An integrated circuit die stack capable of detecting a physical tampering event, and a method thereof, are described herein. The integrated circuit die stack includes a first integrated circuit die including a sensor network that extends substantially across an entire top surface of the first integrated circuit die, and a second integrated circuit die stacked below the first integrated circuit die. The second integrated circuit die is configured to receive sensing signals generated by the sensor network via a plurality of through-silicon-vias coupled with the first integrated circuit die and the second integrated circuit die.
G11C 19/28 - Digital stores in which the information is moved stepwise, e.g. shift registers, using semiconductor elements
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, the devices being of types provided for in several different main groups of the same subclass , , , , or
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
H01L 23/00 - Details of semiconductor or other solid-state devices
H01L 23/48 - Arrangements for conducting electric current to or from the solid-state body in operation, e.g. leads or terminal arrangements
52.
PRUNING OF TECHNOLOGY-MAPPED MACHINE LEARNING-RELATED CIRCUITS AT BIT-LEVEL GRANULARITY
Embodiments herein describe pruning of technology-mapped machine learning-related circuits at bit-level granularity, including techniques to efficiently remove look-up tables (LUTs) of a technology-mapped netlist while maintaining a baseline accuracy of an underlying machine learning model. In an embodiment, a LUT output of a current circuit design is replaced with a constant value, and at least the LUT and LUTs within a maximum fanout-free cone (MFFC) are removed, to provide an optimized circuit design. The current circuit design or the optimized circuit design is selected as a solution based on corresponding training data-based accuracies and metrics (e.g., LUT utilization), and optimization criteria. If the optimized circuit design is rejected, inputs to the LUT may be evaluated for pruning. A set of solutions may be evaluated based on validation data-based accuracies and metrics of the corresponding circuit design. Solutions that do not meet a baseline accuracy may be discarded.
An integrated circuit die stack capable of detecting a physical tampering event, and a method thereof, are described herein. The integrated circuit die stack includes a first integrated circuit die including a sensor network that extends substantially across an entire top surface of the first integrated circuit die, and a second integrated circuit die stacked below the first integrated circuit die. The second integrated circuit die is configured to receive sensing signals generated by the sensor network via a plurality of through-silicon-vias coupled with the first integrated circuit die and the second integrated circuit die.
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid-state devices, the devices being of types provided for in several different main groups of the same subclass , , , , or
H01L 23/48 - Arrangements for conducting electric current to or from the solid-state body in operation, e.g. leads or terminal arrangements
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
H01L 25/00 - Assemblies consisting of a plurality of semiconductor or other solid-state devices
A multi-chiplet system includes a first chiplet comprising a first transceiver and a first chiplet-to-chiplet (C2C) interface module, and a second chiplet comprising programmable logic circuitry and a second C2C interface module. The first transceiver is configured to generate a clock, which is transmitted from the first C2C interface module to the second C2C interface module, through a clock transmission wire, for data transfer between the first chiplet and the second chiplet.
A circuit design emulation system having a plurality of integrated circuits (ICs) includes a first IC. The first IC includes an originator circuit configured to issue a request of a transaction directed to a completer circuit. The request is specified in a communication protocol. The first IC includes a completer transactor circuit coupled to the originator circuit and configured to translate the request into request data. The first IC includes a first interface circuit configured to synchronize the request data from an originator clock domain to a transceiver clock domain operating at a higher frequency than the originator clock domain. The first IC includes a first transceiver circuit configured to convey the request data over a communication link that operates asynchronously to the originator clock domain.
G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
56.
DEVICES, SYSTEMS, AND METHODS FOR A PROGRAMMABLE THREE-DIMENSIONAL SEMICONDUCTOR POWER DELIVERY NETWORK
A disclosed semiconductor device includes (1) a silicon stack comprising a front-side Back-End-of-Line (BEOL) stack and a back-side BEOL stack, the front-side BEOL stack comprising a plurality of signal routes and the back-side BEOL stack comprising a plurality of power delivery routes, and (2) a plurality of auxiliary power paths formed within the front-side BEOL stack and electrically coupled to the plurality of power delivery routes of the back-side BEOL stack via a plurality of programmable switches, the plurality of power delivery routes, the plurality of programmable switches, and the plurality of auxiliary power paths forming a programmable power delivery network (PDN). Various other apparatuses, systems, and methods of operation are also disclosed.
A system-on-chip (SoC) has programmable logic and a processor. A design tool generates configuration data to implement circuitry for emulation of a design-under-test (DUT) on the programmable logic and generates testbench executable code. The testbench executable code is configured to generate stimuli to the circuitry on the programmable logic. The processor can be configured to execute the testbench executable code and the programmable logic can be configured to implement the circuitry for emulation of the DUT.
An analog-to-digital converter (ADC) circuitry includes channels that are interleaved with each other to generate output digital signals from input analog signals. A first channel includes sub-ADC circuitry, amplitude detection circuitry, and correction circuitry. Random chopping is applied by chopping circuitry at the input of the sub-ADC circuitry while sampling. The sub-ADC circuitry outputs digital data corresponding to the chopping states. Gain mismatch within the chopping circuitry is mitigated by determining correction values via the amplitude detection circuitry and the correction circuitry and applying the correction values to the output of the sub-ADC circuitry. The amplitude detection circuitry determines an amplitude difference between data signals. The correction circuitry is coupled to the output of the amplitude detection circuitry. The correction circuitry generates the correction values based on the amplitude difference, and outputs the correction values to adjust the data signals.
Examples herein describe inductor circuitry including an inductor coil having a helical shape. The inductor coil includes a first turn and a second turn which are disposed within an isolation wall. The isolation wall extends above the inductor coil and below the inductor coil. The inductor circuitry includes an inductor leg which extends through an aperture of the isolation wall. The inductor leg includes a first portion which is disposed within the isolation wall and a second portion that is disposed outside of the isolation wall.
A disclosed semiconductor device includes (1) a silicon stack comprising a front-side Back-End-of-Line (BEOL) stack and a back-side BEOL stack, the front-side BEOL stack comprising a plurality of signal routes and the back-side BEOL stack comprising a plurality of power delivery routes, and (2) a plurality of auxiliary power paths formed within the front-side BEOL stack and electrically coupled to the plurality of power delivery routes of the back-side BEOL stack via a plurality of programmable switches, the plurality of power delivery routes, the plurality of programmable switches, and the plurality of auxiliary power paths forming a programmable power delivery network (PDN). Various other apparatuses, systems, and methods of operation are also disclosed.
H01L 23/528 - Layout of the interconnection structure
H01L 23/48 - Arrangements for conducting electric current to or from the solid-state body in operation, e.g. leads or terminal arrangements
61.
SCHEDULING KERNELS ON A DATA PROCESSING SYSTEM WITH ONE OR MORE COMPUTE CIRCUITS
Scheduling kernels on a system with heterogeneous compute circuits includes receiving, by a hardware processor, a plurality of kernels and a graph including a plurality of nodes corresponding to the plurality of kernels. The graph defines a control flow and a data flow for the plurality of kernels. The kernels are implemented within different ones of a plurality of compute circuits coupled to the hardware processor. A set of buffers for performing a job for the graph are allocated based, at least in part, on the data flow specified by the graph. Different ones of the kernels as implemented in the compute circuits are invoked based on the control flow defined by the graph.
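The graph-driven scheduling described in the abstract above can be sketched in software with a topological ordering. This is a simplified model under stated assumptions: kernels are plain Python callables rather than functions implemented in compute circuits, one buffer is allocated per data-flow edge, the control flow is taken to be the dependency order implied by those edges, and the names `run_graph`, `kernels`, `edges`, and `inputs` are invented for the example.

```python
from graphlib import TopologicalSorter

def run_graph(kernels, edges, inputs=None):
    """Run kernels in dependency order, passing data through per-edge buffers.

    kernels: dict mapping node name -> callable taking a list of input values
    edges:   list of (producer, consumer) pairs describing the data flow
    inputs:  optional dict mapping source node name -> list of external inputs
    """
    inputs = inputs or {}

    # Allocate one buffer per data-flow edge before any kernel runs.
    buffers = {edge: None for edge in edges}

    # Build the predecessor map that TopologicalSorter expects.
    predecessors = {name: [] for name in kernels}
    for producer, consumer in edges:
        predecessors[consumer].append(producer)

    order = list(TopologicalSorter(predecessors).static_order())

    results = {}
    for name in order:
        # Gather arguments from incoming buffers, or from external inputs.
        args = [buffers[(p, name)] for p in predecessors[name]] or inputs.get(name, [])
        results[name] = kernels[name](args)
        # Publish the result into every outgoing buffer.
        for edge in edges:
            if edge[0] == name:
                buffers[edge] = results[name]
    return order, results

kernels = {
    "load": lambda args: 3,
    "scale": lambda args: args[0] * 2,
    "add": lambda args: args[0] + args[1],
}
edges = [("load", "scale"), ("load", "add"), ("scale", "add")]
order, results = run_graph(kernels, edges)
```

A real scheduler would dispatch each kernel to the compute circuit that implements it and could run independent nodes concurrently; the sequential loop here only shows the ordering and buffer hand-off.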
A transceiver circuit is disclosed. The transceiver circuit includes a transmitter driver circuit configured to drive a transmit antenna. The transceiver circuit also includes a receiver circuit configured to generate digital signals based on received signals. The transceiver circuit also includes a loopback data path circuit electrically connected to the transmitter driver circuit and to the receiver circuit, where the loopback data path circuit is configured to conditionally provide signals from the transmitter driver circuit to the receiver circuit according to one or more control signals. The transceiver circuit also includes a controller configured to generate the control signals.
H02H 9/04 - Emergency protective circuit arrangements for limiting excess current or voltage without disconnection, responsive to excess voltage
H04B 1/48 - Transmit/receive switching in circuits for connecting transmitter and receiver to a common transmission path, e.g. by energy of transmitter
H04B 17/14 - Monitoring; Testing of transmitters for calibration of the whole transmission and reception path, e.g. self-test loop-back
Disclosed herein is a chip package assembly that includes a package substrate coupled with an integrated circuit die, a stiffener attached to a top surface of the package substrate, and a connector assembly integrated with the stiffener. Both the connector assembly and the stiffener are disposed at a peripheral area of the top surface. The connector assembly includes a bracket and a connector. The connector is configured to connect with one or more optical cables or electrical connectors. The bracket may be formed by a cavity in the stiffener. The bracket may be attached to the top surface of the package substrate. The stiffener may be coupled with the bracket directly or via the connector. Additionally, a frame coupled to the stiffener or a printed circuit board (PCB) may be used to secure the bracket in place.
H01L 23/498 - Leads on insulating substrates
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 23/043 - Containers; Seals characterised by the shape, the container being a hollow construction and having a conductive base as a mounting as well as a lead for the semiconductor body
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H05K 1/14 - Structural association of two or more printed circuits
64.
RECLAMATION OF MEMORY ECC BITS FOR ERROR TOLERANT NUMBER FORMATS
A method for operating a computing system includes determining a baseline accuracy of the computing system based on a baseline data transmission format comprising a baseline quantity of data bits and a baseline quantity of error correction code (ECC) bits, determining sample accuracies of the computing system based on sample data transmission formats each including a quantity of data bits and a quantity of ECC bits that are different from the baseline quantity of data bits and the baseline quantity of ECC bits, and storing data in a memory device of the computing system using at least one data transmission format, wherein the at least one data transmission format is selected from a group of data transmission formats comprising the baseline data transmission format and the sample data transmission formats, and the at least one data transmission format is selected based on the baseline accuracy and the sample accuracies.
G06F 11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
G06F 11/10 - Error detection or correction by redundancy in data representation, e.g. by using checking codes, adding special bits or symbols to the coded information, e.g. parity check, casting out nines or elevens
65.
INTERCONNECT CIRCUIT FOR MULTI-CHANNEL AND MULTI-REQUESTER MEMORY SYSTEMS
An integrated circuit device includes interconnect circuitry. The interconnect circuitry includes interleaving switch circuitries, network switch circuitries, and crossbar circuitries. The interleaving switch circuitries are coupled to requester devices. A first interleaving switch circuitry includes first ports. The first interleaving switch circuitry receives a first memory command, and outputs the first memory command via first communication lanes connected to a first port based on a memory address of the first memory command. The network switch circuitries are connected to the interleaving switch circuitries. A first network switch circuitry is connected to the first communication lanes and routes the first memory command along the first communication lanes based on the memory address. A first crossbar circuitry of the crossbar circuitries receives the first memory command from the first communication lanes, and outputs the first memory command to a first memory device of the memory devices associated with the memory.
Embodiments herein describe using virtual destinations to route packets through a NoC (105). In one embodiment, instead of decoding an address into a target destination ID of the NoC (105), an ingress logic block (115) assigns packets for multiple different targets the same virtual destination ID. For example, these targets may be in the same segment or location of the NoC (105). Thus, instead of the ingress logic block (115) having to store entries in a lookup-table for each target, it can have a single entry for the virtual destination ID. The packets for the targets are then routed using the virtual destination ID to a decoder switch (140) in the NoC (105). This decoder switch (140) can then use the address in the packet (which is different than the destination ID) to select the appropriate target destination ID.
H04L 49/101 - Packet switching elements characterised by the switching fabric construction using a crossbar or matrix
H04L 49/109 - Packet switching elements characterised by the switching fabric construction integrated on microchip, e.g. switch-on-chip
Generating low skew clock solutions for local clocks in an integrated circuit includes, for a circuit design, determining a plurality of delay ranges for respective clock pins of a local clock net. Each delay range of the plurality of delay ranges includes an upper bound delay and a lower bound delay. The upper bound delays of the plurality of delay ranges are allocated as setup constraints for the respective clock pins of the local clock net. The lower bound delays are allocated as hold constraints for the respective clock pins of the local clock net. The local clock net is routed using the setup constraints and the hold constraints.
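The allocation step in this abstract can be pictured with a small Python sketch. It assumes each clock pin already has a delay range in nanoseconds; the `DelayRange` type, the function names, and the final window check are hypothetical illustrations, and the router itself is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayRange:
    lower_ns: float  # lower bound delay for a clock pin
    upper_ns: float  # upper bound delay for a clock pin


def allocate_constraints(delay_ranges):
    """Split per-pin delay ranges into setup and hold constraints.

    Upper bounds become setup constraints and lower bounds become
    hold constraints, as the abstract describes.
    """
    setup = {pin: r.upper_ns for pin, r in delay_ranges.items()}
    hold = {pin: r.lower_ns for pin, r in delay_ranges.items()}
    return setup, hold


def meets_constraints(routed_delay_ns, setup, hold):
    """Hypothetical check: does each routed delay fall inside its window?"""
    return {
        pin: hold[pin] <= delay <= setup[pin]
        for pin, delay in routed_delay_ns.items()
    }
```

Keeping the two bounds as separate constraint maps mirrors the way setup and hold are treated as distinct requirements during routing.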
Control set optimization for a circuit design includes generating, by a processor, Observability Don't Care (ODC) expressions for registers of the circuit design. Redundant reset pins of the registers of the circuit design are determined by the processor by iteratively checking, on a per-cube and a per-literal basis for each ODC expression, whether a value of a literal causes the ODC expression to evaluate to 1. A modified version of the circuit design is generated by the processor by connecting one or more reset pins of the set of redundant reset pins to one or more constants.
G06F 30/398 - Design verification or optimisation, e.g. using design rule check [DRC], layout versus schematics [LVS] or finite element methods [FEM]
69.
INTERCONNECT CIRCUITRY FOR MULTI-CHANNEL AND MULTI-REQUESTER MEMORY SYSTEMS
An integrated circuit device includes interconnect circuitry. The interconnect circuitry includes interleaving switch circuitries, network switch circuitries, and crossbar circuitries. The interleaving switch circuitries are coupled to requester devices. A first interleaving switch circuitry includes first ports. The first interleaving switch circuitry receives a first memory command, and outputs the first memory command via first communication lanes connected to a first port based on a memory address of the first memory command. The network switch circuitries are connected to the interleaving switch circuitries. A first network switch circuitry is connected to the first communication lanes and routes the first memory command along the first communication lanes based on the memory address. A first crossbar circuitry of the crossbar circuitries receives the first memory command from the first communication lanes, and outputs the first memory command to a first memory device of the memory devices associated with the memory.
Some examples described herein provide for display image data reliability and safety, for example end-to-end safety methods, apparatuses, and systems for display systems. One example includes a method, including replacing video frames from input video streams with a set of test frames. The method further includes generating an alpha-blended video stream based on the set of test frames and the input video streams. The method further includes generating and inserting cyclic redundancy check (CRC) information for the set of test frames into secondary data packets associated with the alpha-blended video stream. The method further includes processing the set of test frames and video frames by a display controller to generate an output video stream. The method further includes performing an error detection procedure for the set of test frames using the CRC information to detect an error associated with the set of video frames.
Some examples described herein provide for instruction glitch protection in an integrated circuit. In an example, a method includes generating a random number by the integrated circuit. The method also includes identifying, based at least in part on the generated random number, a sequence from a set of sequences stored in a memory of the integrated circuit, each sequence of the set of sequences corresponding to an order of execution for a plurality of tasks. The method further includes performing, by the integrated circuit, each task of the plurality of tasks in the order of execution corresponding to the identified sequence.
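A minimal software sketch of the randomized ordering described in this abstract follows. The stored `SEQUENCES`, the function name, and the injectable `rng` parameter are hypothetical; Python's `secrets.randbelow` stands in for the on-chip random number generator, which the abstract does not specify.

```python
import secrets

# Assumption: a small set of pre-stored execution orders for three tasks.
SEQUENCES = [(0, 1, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]


def run_tasks_randomized(tasks, sequences=SEQUENCES, rng=secrets.randbelow):
    """Run every task once, in an order chosen by a random number.

    tasks:     list of zero-argument callables, indexed by task id
    sequences: stored orders of execution, each a tuple of task ids
    rng:       callable(n) returning a random integer in [0, n)
    """
    # Generate a random number and use it to identify a stored sequence.
    order = sequences[rng(len(sequences))]
    # Perform each task in the order given by the identified sequence.
    results = {task_id: tasks[task_id]() for task_id in order}
    return order, results
```

Because the order differs from run to run, an attacker who times a fault injection against one task cannot rely on that task executing at a fixed point.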
In one example, a micro device includes a housing, a chip package disposed in the housing, and a noise producing component coupled to the housing. The micro device also includes a noise reduction system having a reference microphone for detecting a noise from the noise producing component and a controller configured to receive the noise from the reference microphone and generate a masking sound signal in response to the detected noise. A speaker is coupled to the housing for producing a masking sound corresponding to the masking sound signal, whereby the masking sound reduces the noise. In another example, the noise producing component comprises a fan.
G10K 11/178 - Methods or devices for protecting against, or for damping, noise or other acoustic waves in general, using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
73.
NETWORK-ON-CHIP ARCHITECTURE WITH DESTINATION VIRTUALIZATION
Embodiments herein describe using virtual destinations to route packets through a NoC. In one embodiment, instead of decoding an address into a target destination ID of the NoC, an ingress logic block assigns packets for multiple different targets the same virtual destination ID. For example, these targets may be in the same segment or location of the NoC. Thus, instead of the ingress logic block having to store entries in a lookup-table for each target, it can have a single entry for the virtual destination ID. The packets for the targets are then routed using the virtual destination ID to a decoder switch in the NoC. This decoder switch can then use the address in the packet (which is different than the destination ID) to select the appropriate target destination ID.
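The two-stage lookup described in this abstract can be sketched in Python as follows. The address ranges, table layouts, and names (`INGRESS_TABLE`, `DECODER_TABLES`, `VDEST_0`, `TARGET_A`, and so on) are invented for illustration; the point is only that the ingress table holds one entry for several targets and the decoder switch resolves the real target from the packet address.

```python
# Each table row is (base_address, limit_address, destination_id).
# Assumption: one ingress entry covers a whole NoC segment via a virtual ID.
INGRESS_TABLE = [
    (0x0000, 0x3FFF, "VDEST_0"),   # single entry shared by several targets
    (0x4000, 0x4FFF, "TARGET_D"),  # ordinary, directly addressed target
]

# Per-virtual-destination tables held by the decoder switch.
DECODER_TABLES = {
    "VDEST_0": [
        (0x0000, 0x0FFF, "TARGET_A"),
        (0x1000, 0x1FFF, "TARGET_B"),
        (0x2000, 0x3FFF, "TARGET_C"),
    ],
}


def lookup(table, address):
    """Return the destination ID whose range contains the address."""
    for base, limit, dest in table:
        if base <= address <= limit:
            return dest
    raise KeyError(f"no route for address {address:#x}")


def ingress(address):
    """Ingress logic: tag the packet with a (possibly virtual) destination."""
    return {"addr": address, "dest": lookup(INGRESS_TABLE, address)}


def decoder_switch(packet):
    """Decoder switch: replace a virtual ID with the real target ID."""
    table = DECODER_TABLES.get(packet["dest"])
    if table is None:
        return packet  # destination was already a real target
    return {**packet, "dest": lookup(table, packet["addr"])}
```

In this sketch the ingress table needs two rows instead of four, which is the storage saving the abstract attributes to virtual destinations.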
Memory driver circuitry for driving a memory cell or cells of a memory device includes first driver path circuitry and selection circuitry. The first driver path circuitry includes driver circuitry that outputs a first signal and selection circuitry that receives the first signal and a second signal, and outputs a first selected signal. The first selected signal is a selected one of the first signal and the second signal. The selection circuitry of the memory driver circuitry receives a third signal and a fourth signal, and outputs a bias voltage signal to header circuitry of a memory cell. The bias voltage signal is a selected one of the third signal and the fourth signal. The third signal corresponds to the first selected signal.
Examples herein describe a scalable tweak engine and prefetching tweak values. Regarding the scalable tweak engine, it can be designed to accommodate different bus widths of data. The scalable tweak engine described herein includes multiple tweak calculators that can be daisy chained together to output multiple tweak values every clock cycle. These tweak values can be sent to multiple encryption cores so that multiple data blocks can be encrypted in parallel. Regarding prefetching tweak values, previous encryption engines incur a delay as the tweak value (e.g., a metadata value) for a data block is calculated. In the embodiments herein, the encryption engine can include an independent metadata engine that determines the metadata value for a subsequent data block while the current data block is being encrypted.
H04L 9/06 - Arrangements for secret or secure communications; Network security protocols, the encryption apparatus using shift registers or memories for block-wise coding, e.g. DES systems
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
A system for clock variation measurement includes a first clock counter circuit configured to generate a plurality of first counts of a first clock signal, a second clock counter circuit configured to generate a plurality of second counts of a second clock signal, a first synchronizer circuit configured to synchronize the plurality of first counts according to a third clock signal, and a second synchronizer circuit configured to synchronize the plurality of second counts according to the third clock signal. The system includes a difference circuit configured to generate a plurality of differences from respective count pairs as synchronized. The system includes a variation circuit configured to generate a variation signal indicating an amount of variation between the first clock signal and the second clock signal based, at least in part, on the plurality of differences.
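The difference and variation stages of this abstract can be modeled in a few lines of Python. The inputs are assumed to be count pairs already synchronized to the third clock; the function names are hypothetical, and taking the spread (maximum minus minimum) of the differences as the variation metric is an assumption, since the abstract does not say how the variation signal is computed.

```python
def count_differences(first_counts, second_counts):
    """Difference circuit: one difference per synchronized count pair."""
    return [a - b for a, b in zip(first_counts, second_counts)]


def variation(differences):
    """Variation circuit (assumed metric): spread of the differences."""
    return max(differences) - min(differences)
```

For example, synchronized counts of `[100, 200, 300, 400]` and `[100, 199, 301, 400]` give differences `[0, 1, -1, 0]` and a variation of 2 counts.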
A level shifter may include a first transistor stack including at least four transistors arranged from a first voltage source to ground, including second and third transistors coupled with a bias voltage source, and a fourth transistor coupled with an input to receive an input signal at a second voltage or ground. The level shifter may include a second transistor stack comprising at least four transistors arranged from the first voltage source to ground, including second and third transistors coupled with the bias voltage source, and a fourth transistor to receive an inverse of the input signal. A first transistor of the first transistor stack is cross-coupled with a first transistor of the second transistor stack. The level shifter may include a first output coupled with the second transistor stack between the second and third transistors to provide a first output signal at the first voltage or ground.
A method for predicting voltage drop on a power delivery network of a 3D stacked device includes receiving a spatial power distribution map of a plurality of semiconductor dies of the 3D stacked device, receiving a spatial power source node location map for a plurality of power source nodes coupled to the 3D stacked device, dividing vertically the spatial power distribution map and the spatial power source node location map into overlapping windows, determining a voltage drop map in each of the windows based on the divided spatial power distribution map and the divided spatial power source node location map, and combining the voltage drop map in each of the windows to form a composite voltage drop map.
G06F 30/398 - Design verification or optimisation, e.g. using design rule check [DRC], layout versus schematics [LVS] or finite element methods [FEM]
G06F 30/392 - Floor-planning or layout, e.g. partitioning or placement
79.
SYSTEM-LEVEL TECHNIQUES FOR ERROR CORRECTION IN CHIP-TO-CHIP INTERFACES
Some examples described herein provide for interconnect in chiplet systems, for example system-level techniques for error correction in chip-to-chip interfaces. In an example, a method of error correction includes receiving, at a first chiplet, a data message via a set of interconnect, and transmitting a first control message that requests retransmission of the data message based on detecting an error associated with receiving the data message. The method also includes transmitting one or more instances of a second control message that indicates an idle operation at the first chiplet until the first chiplet receives a third control message that triggers an end of a retransmission mode. The method also includes transmitting a fourth control message frame indicating the end of the retransmission mode, and receiving a retransmission of the data message from the second chiplet.
Embodiments herein describe a host that polls a network adapter to receive data from a network. That is, the host/CPU/application thread polls the network adapter (e.g., the network card, NIC, or SmartNIC) to determine whether a packet has been received. If so, the host informs the network adapter to store the packet (or a portion of the packet) in a CPU register (205). If the requested data has not yet been received by the network adapter from the network (210), the network adapter can delay (230) responding to the request to provide extra time for the adapter to receive the data from the network.
G06F 13/12 - Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor
G06F 13/22 - Handling requests for interconnection or transfer for access to input/output bus using successive scanning, e.g. polling
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 13/38 - Information transfer, e.g. on bus
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 13/366 - Handling requests for interconnection or transfer for access to common bus or bus system with centralised access control using a centralised polling arbiter
H04L 47/56 - Queue scheduling implementing delay-aware scheduling
H04L 47/30 - Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes
G06F 13/32 - Handling requests for interconnection or transfer for access to input/output bus using combination of interrupt and burst mode transfer
Embodiments herein describe a host that polls a network adapter to receive data from a network. That is, the host/CPU/application thread polls the network adapter (e.g., the network card, NIC, or SmartNIC) to determine whether a packet has been received. If so, the host informs the network adapter to store the packet (or a portion of the packet) in a CPU register. If the requested data has not yet been received by the network adapter from the network, the network adapter can delay responding to the request to provide extra time for the adapter to receive the data from the network.
H04L 43/103 - Active monitoring, e.g. heartbeat, ping or trace-route, with adaptive polling, i.e. dynamically adapting the polling rate
H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
82.
BUILDING MULTI-DIE FPGAS USING CHIP-ON-WAFER TECHNOLOGY
Embodiments herein describe techniques to build multi-die field-programmable gate arrays (FPGAs) using chip-on-wafer (CoW) technology. In an embodiment, FPGA chiplets (i.e., dies) and an interposer substrate include respective hybrid bonding connectors. Metal layers of the interposer substrate are patterned to provide inter-die communications amongst the multiple dies via the hybrid bonding connectors, and the dies communicate with one another via the hybrid bonding connectors using a non-serialized protocol native to the FPGA. The dies may communicate with one another through edge-based hybrid bonding connectors (e.g., in a symmetrical fashion). The metal layers of the interposer substrate may also support intra-die communications (e.g., data, clocks, and/or controls) and/or provide power, clock(s), and/or configuration parameters to the dies via hybrid bonding connectors within central regions of the dies. The IC device may include more than 1000 tracks per millimeter (e.g., more than 1600, 2800, 3500, or greater).
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
G06F 30/34 - Circuit design for reconfigurable circuits, e.g. field programmable gate arrays [FPGA] or programmable logic devices [PLD]
H01L 23/538 - Arrangements for conducting electric current within the device in operation from one component to another, the interconnection structure between a plurality of semiconductor chips being formed on, or in, insulating substrates
H01L 23/00 - Details of semiconductor or other solid state devices
Examples herein describe techniques for producing a three-dimensional (3D) die stack. The techniques include stacking a first die on top of a second die. The first die is offset from the second die in at least one of an x-direction and a y-direction, and a first routing sub-region of the first die aligns with a second routing sub-region of the second die. The techniques further include stacking a third die on top of the second die. The third die is offset from the second die in at least one of the x-direction and the y-direction, and a third routing sub-region of the third die aligns with a fourth routing sub-region of the second die.
H01L 25/00 - Assemblies consisting of a plurality of semiconductor or other solid state devices
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices being of types provided for in two or more different main groups of the same subclass , , , , or
84.
MULTI-HOST AND MULTI-CLIENT DIRECT MEMORY ACCESS SYSTEM HAVING A READ SCHEDULER
A direct memory access (DMA) system includes a read request circuit configured to receive read requests from a plurality of client circuits. The DMA system includes a response reassembly circuit configured to reorder read completion data received from a plurality of different hosts in response to the read requests. The DMA system includes a read scheduler circuit configured to schedule conveyance of the read completion data from the response reassembly circuit to the plurality of client circuits. The DMA system includes a data pipeline circuit including a plurality of data paths. The plurality of data paths are configured to convey the read completion data as scheduled by the read scheduler circuit to respective ones of the plurality of client circuits.
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
Embodiments herein describe devices that include an interposer with a stitch formed from overlapping exposure areas, which may result in the interposer having a total surface area that is greater than a maximum reticle field corresponding to the exposure areas. Two or more integrated circuits (e.g., chiplets) can be disposed on the interposer. At least one of the integrated circuits is disposed over the stitch. The interposer can provide chip-to-chip connections between the integrated circuits.
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 23/538 - Arrangements for conducting electric current within the device in operation from one component to another, the interconnection structure between a plurality of semiconductor chips being formed on, or in, insulating substrates
A processor [102] employs a hardware signal monitor [110] to manage signaling for accelerators [103, 104]. The hardware signal monitor monitors designated memory addresses assigned to accelerator signals. In response to a memory write [112] to one of the designated memory addresses, the hardware signal monitor executes a set of one or more operations (referred to as a callback). The hardware signal monitor thereby enables improved and enhanced signaling features, such as asynchronous signaling between agents, inter-accelerator signaling, and inter-process signaling.
Examples herein describe a three-dimensional (3D) die stack. The 3D die stack includes a programmable logic (PL) die and a compute die stacked on top of the PL die. The PL die includes a plurality of configurable blocks and a plurality of first electrical connections on a top side of the PL die. The compute die includes a plurality of data processing engines and a plurality of second electrical connections on a bottom side of the compute die. The three-dimensional die stack includes a plurality of tiles, each tile comprising M configurable blocks included in the plurality of configurable blocks and N data processing engines included in the plurality of data processing engines.
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
H01L 25/18 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices being of types provided for in two or more different main groups of the same subclass , , , , or
H10B 80/00 - Assemblies of multiple devices comprising at least one memory device covered by this subclass
A memory device includes a first bit cell comprising a first inverter, the first inverter comprising a p-type transistor coupled to an n-type transistor, and header circuitry coupled to the first inverter and comprising a first header transistor and a second header transistor, the first header transistor having a gate configured to receive a bias voltage, the second header transistor having a gate configured to receive a reference voltage.
Embodiments herein describe techniques to build multi-die field-programmable gate arrays (FPGAs) using chip-on-wafer (CoW) technology. In an embodiment, FPGA chiplets (i.e., dies) and an interposer substrate include respective hybrid bonding connectors. Metal layers of the interposer substrate are patterned to provide inter-die communications amongst the multiple dies via the hybrid bonding connectors, and the dies communicate with one another via the hybrid bonding connectors using a non-serialized protocol native to the FPGA. The dies may communicate with one another through edge-based hybrid bonding connectors (e.g., in a symmetrical fashion). The metal layers of the interposer substrate may also support intra-die communications (e.g., data, clocks, and/or controls) and/or provide power, clock(s), and/or configuration parameters to the dies via hybrid bonding connectors within central regions of the dies. The IC device may include more than 1000 tracks per millimeter (e.g., more than 1600, 2800, 3500, or greater).
H01L 23/498 - Leads on insulating substrates
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 23/538 - Arrangements for conducting electric current within the device in operation from one component to another, the interconnection structure between a plurality of semiconductor chips being formed on, or in, insulating substrates
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, the devices all being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
A smart cache implementation for image warping is provided by dividing an output image into a plurality of blocks corresponding to initial coordinates in the output image; dividing an input image into at least first and second regions of pixels, where the first region overlaps the second region; generating an unsorted remap vector of the plurality of blocks for image warping the input image; identifying first and second subsets of blocks from the plurality of blocks that can be reconstructed using the first and second regions respectively; generating a region-based sorting, a line-based sorting of the region-based sorting, a column-based sorting of the line-based sorting based on the initial x-coordinates of the blocks in the unsorted remap vector, and a sorted remap vector by sorting the column-based sorting based on initial y-coordinates of the blocks in the unsorted remap vector.
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
Performance evaluation of a heterogeneous hardware platform includes implementing a traffic generator design in an integrated circuit. The traffic generator design includes traffic generator kernels including a traffic generator kernel implemented in a data processing array of the integrated circuit and a traffic generator kernel implemented in a programmable logic of the integrated circuit. The traffic generator design is executed in the integrated circuit. The traffic generator kernels implement data access patterns by, at least in part, generating dummy data. Performance data is generated from executing the traffic generator design in the integrated circuit. The performance data is output from the integrated circuit.
A method, system, and circuit arrangement involve synthesizing a circuit design specified in a register transfer level (RTL) specification into a netlist. The RTL specification includes an assert statement that specifies a conditional expression involving one or more signals specified in the circuit design to be checked during simulation, and the synthesizing includes synthesizing the assert statement into netlist elements. The design tool places and routes the netlist into a circuit design layout and generates implementation data from the layout.
G06F 30/327 - Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist
G06F 30/31 - Design entry, e.g. editors specifically adapted for circuit design
G06F 30/323 - Translation or migration, e.g. logic to logic, hardware description language [HDL] translation or netlist translation
93.
IC WITH ADAPTIVE CHIP-TO-CHIP INTERFACE TO SUPPORT DIFFERENT CHIP-TO-CHIP PROTOCOLS
Embodiments herein describe using an adaptive chip-to-chip (C2C) interface to interconnect two chips, wherein the adaptive C2C interface includes circuitry for performing multiple different C2C protocols to communicate with the other chip. One or both of the chips in the C2C connection can include the adaptive C2C interface. During boot time, the adaptive C2C interface is configured to perform one of the different C2C protocols. During runtime, the chip then uses the selected C2C protocol to communicate with the other chip in the C2C connection.
High-level synthesis of designs using loop-aware execution information includes generating, using computer hardware, an intermediate representation (IR) of a design specified in a high-level programming language. The design is for an integrated circuit. Execution information analysis is performed on the IR of the design, generating analysis results for functions of the design. The analysis results of the design are transformed by embedding the analysis results in a plurality of regions of the IR of the design. Selected regions of the plurality of regions are merged based on the analysis results, as embedded, for the selected regions. The IR of the design is scheduled using the analysis results subsequent to the merging.
G06F 30/323 - Translation or migration, e.g. logic to logic, hardware description language [HDL] translation or netlist translation
95.
MULTI-DIE PHYSICALLY UNCLONABLE FUNCTION ENTROPY SOURCE
Disclosed circuit arrangements include a physically unclonable function (PUF) entropy source having passive circuit elements and active circuit elements. A first die has one or more metal layers and an active layer, and the passive circuit elements are disposed in the one or more metal layers. A second die has one or more metal layers and an active layer. The active circuit elements are coupled to the passive circuit elements and are disposed in the active layer of the second die, and the first die and the second die are in a stacked structure. The stacked structure has the one or more metal layers of the first die disposed between the active layer of the first die and the active layer of the second die.
H01L 23/00 - Details of semiconductor or other solid state devices
H01L 25/065 - Assemblies consisting of a plurality of semiconductor or other solid state devices, all the devices being of a type provided for in a single one of the subclasses , , , , or , e.g. assemblies of rectifier diodes, the devices not having separate containers, the devices being of a type provided for in group
96.
PROCESS AND TEMPERATURE TRACKING ON-CHIP SUPPLY REGULATION FOR LOW JITTER APPLICATIONS
An on-chip integrated circuit supply voltage regulator has a reference voltage that varies based on process and temperature conditions of the integrated circuit. The supply voltage is boosted if the active transistor load devices operate in a Slow-Slow process condition and/or the temperature rises. A higher supply voltage improves system performance (jitter/delay) if the load network includes switching components. If the active transistor load devices operate in a Fast-Fast process condition, the supply voltage is reduced without loss of performance, which saves power. The variable reference voltage is generated based on process and temperature conditions of the semiconductor integrated circuit devices (transistors). The voltage regulator automatically has its variable reference voltage adjusted based on the fabrication process condition and the temperature of the areas of the integrated circuit where the active transistor load devices are located.
G05F 1/56 - Regulating voltage or current wherein the variable actually regulated by the final control device is DC, using semiconductor devices in series with the load as final control devices
97.
METHODS AND APPARATUSES FOR WAVELENGTH LOCKING FOR OPTICAL WAVELENGTH DIVISION MULTIPLEXED MICRO-RING MODULATORS
Some examples described herein provide for controlling output modulation amplitude for optoelectronic devices. In an example, a method includes transmitting a data pattern to an optical modulator device. The method also includes identifying, for each heater control value of a plurality of heater control values for a heater thermally coupled with the optical modulator device, an optical modulation amplitude corresponding to the heater control value based on a corresponding photodiode current value identified while transmitting the data pattern. The method also includes determining a maximum optical modulation amplitude for the optical modulator device based on a plurality of optical modulation amplitudes corresponding to the plurality of heater control values according to the identifying. The method also includes controlling the heater based at least in part on the determined maximum optical modulation amplitude that has been modified according to scaling maximum photodiode current values.
G02F 1/01 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics for the control of the intensity, phase, polarisation or colour
G02F 1/015 - Devices or arrangements for the control of the intensity, colour, phase, polarisation or direction of light arriving from an independent light source, e.g. switching, gating or modulating; Non-linear optics for the control of the intensity, phase, polarisation or colour based on semiconductor elements having potential barriers, e.g. a PN or PIN junction
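The heater sweep in the wavelength-locking abstract above can be pictured roughly as follows. This is only an illustrative sketch, not the patented method: the names `sweep_heater`, `measure_oma`, `scale`, and `toy_oma` are hypothetical, and the step that derives an optical modulation amplitude (OMA) from a photodiode current is hidden behind a caller-supplied function because the abstract does not specify it.

```python
from typing import Callable, Iterable, Optional, Tuple


def sweep_heater(
    codes: Iterable[int],
    measure_oma: Callable[[int], float],
    scale: float = 1.0,
) -> Tuple[int, float, float]:
    """Sweep heater control values and locate the maximum OMA.

    measure_oma(code) is assumed to apply the heater code, read the
    photodiode current while the data pattern is being transmitted,
    and return an OMA estimate (hypothetical interface).
    Returns (best_code, max_oma, scaled_target).
    """
    best_code: Optional[int] = None
    max_oma = float("-inf")
    for code in codes:
        oma = measure_oma(code)
        if oma > max_oma:
            best_code, max_oma = code, oma
    if best_code is None:
        raise ValueError("no heater control values supplied")
    # The abstract mentions modifying the maximum by a scaling step;
    # a single multiplicative factor is an assumption made here.
    return best_code, max_oma, scale * max_oma


def toy_oma(code: int) -> float:
    """Toy resonance-like response peaking at heater code 40 (made up)."""
    return 1.0 / (1.0 + ((code - 40) / 8.0) ** 2)


best_code, max_oma, target = sweep_heater(range(64), toy_oma, scale=0.9)
```

In a real control loop the scaled target would presumably feed a tracking controller for the heater; that part is left out because the abstract gives no detail on it.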
An integrated circuit (IC) device includes functional circuitry and data capture circuitry that stores a state of the functional circuitry in a buffer and outputs contents of the buffer to an external device based on a trigger. An embedded processor interacts with the functional circuitry based on a computer program, and initiates the trigger. The processor may initiate the trigger at a selectable break-point of the computer program and/or based on data generated by the functional circuitry. The processor may also output corresponding states of variables managed by the processor. The processor may initiate the trigger by asserting a predetermined value on a communication path between the processor and the functional circuitry, or over another communication path (e.g., an AXI debug hub) between the processor and the data capture circuitry. The processor may monitor/control the data capture circuitry through an API.
Embodiments herein describe techniques to extend a network-on-chip (NoC) across multiple IC dice in 3D. An integrated circuit (IC) device includes first and second vertically-stacked IC dice, and an inter-die bus that interfaces between the second die and a NoC packet switch (NPS) of the first die. The inter-die bus may include one or more driver circuits coupled to inter-die links of the inter-die bus. Communications over the inter-die links may be synchronous (e.g., packet-based) or asynchronous with the NPS (e.g., based on a point-to-point protocol, such as an AXI protocol). The inter-die bus may interface with a circuit block of the second IC die via a point-to-point (e.g., AXI) protocol or via an NPS of the second IC die. The IC device may include multiple inter-die buses, which may expand inter-die and intra-die routing options.
A smart interrupt controller (SIC) routes an interrupt to a specific processor by dynamically changing the affinity of the interrupt based upon the processor power state and/or system load. The SIC arbitrates interrupt servicing based on various parameters such as interrupt priority, interrupt affinity, processor load, and processor power. Interrupt load sharing between selected processors increases overall computer system performance. Interrupt latency decreases when the SIC avoids unnecessarily switching a processor from an inactive power state to an active one and instead routes the interrupt to a different processor that is already active. Interrupt latency also decreases when the interrupt service request is routed away from a heavily loaded processor to one that is less heavily loaded. As a result, active processor clock cycles are effectively utilized for interrupt servicing. Overall computer system power requirements are reduced by eliminating unnecessary wake-ups of an inactive (sleeping) processor.
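As a rough illustration of the routing policy the smart interrupt controller abstract describes, the sketch below picks a target processor from power state and load. It is a simplified assumption, not the patented arbitration: the `Cpu` type, the `route_interrupt` function, and the `load_threshold` parameter are hypothetical, and interrupt priority is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cpu:
    cpu_id: int
    active: bool   # True if the processor is already in an active power state
    load: float    # utilization in the range 0.0 to 1.0


def route_interrupt(cpus: List[Cpu], affinity: int,
                    load_threshold: float = 0.8) -> int:
    """Return the id of the processor that should service the interrupt.

    Policy (an assumption for illustration):
    1. If no processor is active, fall back to the configured affinity,
       since some processor has to be woken anyway.
    2. Keep the configured affinity if that processor is active and its
       load does not exceed load_threshold.
    3. Otherwise re-target the interrupt to the least-loaded active
       processor, so a sleeping processor is not woken unnecessarily.
    """
    active = [c for c in cpus if c.active]
    if not active:
        return affinity
    preferred = next((c for c in cpus if c.cpu_id == affinity), None)
    if preferred is not None and preferred.active \
            and preferred.load <= load_threshold:
        return affinity
    return min(active, key=lambda c: c.load).cpu_id


cpus = [Cpu(0, active=False, load=0.0),
        Cpu(1, active=True, load=0.9),
        Cpu(2, active=True, load=0.2)]
sleeping_target = route_interrupt(cpus, affinity=0)    # CPU 0 is asleep
overloaded_target = route_interrupt(cpus, affinity=1)  # CPU 1 is overloaded
```

Both example calls re-target the interrupt to CPU 2, the least-loaded processor that is already active.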