A processor includes a vector register file including vector registers, at least one buffer register, and a vector processing core to receive a vector instruction comprising a first identifier representing a first vector register of the vector registers, and a second identifier representing a second vector register of the vector registers, wherein the first vector register is a source register and the second vector register is a destination register, execute the vector instruction based on data values stored in the first vector register to generate a result and store the result in the at least one buffer register, and copy the result from the at least one buffer register to the second vector register.
A system and method relating to constructing an encoder and decoder neural network for providing semantic image segmentation includes generating an encoder comprising encoding convolution layers, each of the encoding convolution layers specifying an encoding filter operation using a respective first filter kernel, generating a decoder corresponding to the encoder, the decoder comprising decoding convolution layers, each of the decoding convolution layers being associated with a corresponding encoding convolution layer, and each of the decoding convolution layers specifying a decoding filter operation using a respective second filter kernel derived from the first filter kernel of the corresponding encoder convolution layer, and providing an input image to the encoder and the decoder for semantic image segmentation.
A processor includes a register file comprising a length register, a vector register file comprising a plurality of vector registers, a mask register file comprising a plurality of mask registers, and a vector instruction execution circuit to execute a masked vector instruction comprising a first length register identifier representing the length register, a first vector register identifier representing a first vector register of the vector register file, and a first mask register identifier representing a first mask register of the mask register file, wherein the length register is to store a length value representing a number of operations to be applied to data elements stored in the first vector register, the first mask register is to store a plurality of mask bits, and a first mask bit of the plurality of mask bits determines whether a corresponding first one of the operations causes an effect.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
4.
Device and method for hardware-efficient adaptive calculation of floating-point trigonometric functions using coordinate rotate digital computer (CORDIC)
A system and an accelerator circuit including a register file comprising instruction registers to store a trigonometric calculation instruction for evaluating a trigonometric function, and data registers comprising a first data register to store a floating-point input value associated with the trigonometric calculation instruction. The accelerator circuit further includes a determination circuit to identify the trigonometric calculation function and the floating-point input value associated with the trigonometric calculation instruction and determine whether the floating-point input value is in a small value range, and an approximation circuit to responsive to determining that the floating-point input value is in the small value, receive the floating-point input value and calculate an approximation of the trigonometric function with respect to the input value.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 5/01 - Procédés ou dispositions pour la conversion de données, sans modification de l'ordre ou du contenu des données maniées pour le décalage, p. ex. la justification, le changement d'échelle, la normalisation
G06F 7/544 - Méthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs n'établissant pas de contact, p. ex. tube, dispositif à l'état solideMéthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs non spécifiés pour l'évaluation de fonctions par calcul
G06F 7/548 - Méthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs n'établissant pas de contact, p. ex. tube, dispositif à l'état solideMéthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs non spécifiés pour l'évaluation de fonctions par calcul de fonctions trigonométriquesChangement de coordonnées
G06F 17/17 - Évaluation de fonctions par des procédés d'approximation, p. ex. par interpolation ou extrapolation, par lissage ou par le procédé des moindres carrés
5.
Device and method for calculating elementary functions using successive cumulative rotation circuit
A system and an accelerator circuit including a register file comprising instruction registers to store an instruction for evaluating an elementary function, and data registers comprising a first data register to store an input value. The accelerator circuit further includes a successive cumulative rotation circuit comprising a reconfigurable inner stage to perform a successive cumulative rotation recurrence, and a determination circuit to determine a type of the elementary function based on the instruction, and responsive to determining that the input value is a fixed-point number, configure the reconfigurable inner stage to a configuration for evaluating the type of the elementary function, wherein the successive cumulative rotation circuit is to calculate an evaluation of the elementary function using the reconfigurable inner stage performing the successive cumulative rotation recurrence.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 7/544 - Méthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs n'établissant pas de contact, p. ex. tube, dispositif à l'état solideMéthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs non spécifiés pour l'évaluation de fonctions par calcul
G06F 7/548 - Méthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs n'établissant pas de contact, p. ex. tube, dispositif à l'état solideMéthodes ou dispositions pour effectuer des calculs en utilisant exclusivement une représentation numérique codée, p. ex. en utilisant une représentation binaire, ternaire, décimale utilisant des dispositifs non spécifiés pour l'évaluation de fonctions par calcul de fonctions trigonométriquesChangement de coordonnées
G06F 17/17 - Évaluation de fonctions par des procédés d'approximation, p. ex. par interpolation ou extrapolation, par lissage ou par le procédé des moindres carrés
G06F 5/01 - Procédés ou dispositions pour la conversion de données, sans modification de l'ordre ou du contenu des données maniées pour le décalage, p. ex. la justification, le changement d'échelle, la normalisation
6.
SYSTEM AND ARCHITECTURE NEURAL NETWORK ACCELERATOR INCLUDING FILTER CIRCUIT
A system and an accelerator circuit includes an internal memory to store data received a memory associated with a processor and a filter circuit block comprising a plurality of circuit stripes, each circuit stripe including a filter processor, a plurality of filter circuits, and a slice of the internal memory assigned to the plurality of filter circuits, where the filter processor is to execute a filter instruction to read data values from the internal memory based on a first memory address, for each of the plurality of circuit stripes: load the data values in weight registers and input registers associated with the plurality of filter circuits of the circuit stripe to generate a plurality of filter results, and write a result generated using the plurality of filter circuits in the internal memory at a second memory address.
A system and method include an accelerator circuit comprising an input circuit block, a filter circuit block, a post-processing circuit block, and an output circuit block and a processor to initialize the accelerator circuit, determining tasks of a neural network application to be performed by at least one of the input circuit block, the filter circuit block, the post-processing circuit block, or the output circuit block, assign each of the tasks to a corresponding one of the input circuit block, the filter circuit block, the post-processing circuit block, or the output circuit block, instruct the accelerator circuit to perform the tasks, and execute the neural network application based on results received from the accelerator circuit completing performance of the tasks.
A system includes a processor and an accelerator circuit including an input circuit block comprising an input processor to perform first tasks of the neural network application, a filter circuit block comprising a filter processor to perform second tasks of the neural network application, and a plurality of general-purpose filters communicatively coupled to the input circuit block, the filter circuit block, where the input circuit block and the filter circuit block form stages of an execution pipeline, a producer stage is to supply data values to a consumer stage, and operation of the consumer stage is on hold until a start flag stored in a first general-purpose register of the plurality of general-purpose registers to be set by the producer stage.
A washing machine including a rotatable cylinder comprising a washing chamber to hold washables, one or more sensors, and a processing device, communicatively connected to the one or more sensors to control an operation of the washing machine, to receive sensor data captured by the one or more sensors, determine, using a machine learning model based on the sensor data, a plurality of properties associated with the washables, determine a setting for the washing machine based on the plurality of properties, and cause the washing machine to operate according to the setting.
A light-emitting diode (LED) lighting device includes a white light source characterized by a general color rendering index (CRI) value and a first color-specific CRI value, and one or more LED elements of a color light within a wavelength band, wherein a combined light source comprising the white light source and the one or more LED elements is characterized by the general CRI value and a second color-specific CRI value, and the second color-specific CRI value is greater than the first color-specific CRI value.
G02F 1/01 - Dispositifs ou dispositions pour la commande de l'intensité, de la couleur, de la phase, de la polarisation ou de la direction de la lumière arrivant d'une source lumineuse indépendante, p. ex. commutation, ouverture de porte ou modulationOptique non linéaire pour la commande de l'intensité, de la phase, de la polarisation ou de la couleur
G02F 1/23 - Dispositifs ou dispositions pour la commande de l'intensité, de la couleur, de la phase, de la polarisation ou de la direction de la lumière arrivant d'une source lumineuse indépendante, p. ex. commutation, ouverture de porte ou modulationOptique non linéaire pour la commande de l'intensité, de la phase, de la polarisation ou de la couleur pour la commande de la couleur
11.
FEEDBACKWARD DECODER FOR PARAMETER EFFICIENT SEMANTIC IMAGE SEGMENTATION
A system and method relating to constructing an encoder and decoder neural network for providing semantic image segmentation includes generating an encoder comprising encoding convolution layers, each of the encoding convolution layers specifying an encoding filter operation using a respective first filter kernel, generating a decoder corresponding to the encoder, the decoder comprising decoding convolution layers, each of the decoding convolution layers being associated with a corresponding encoding convolution layer, and each of the decoding convolution layers specifying a decoding filter operation using a respective second filter kernel derived from the first filter kernel of the corresponding encoder convolution layer, and providing an input image to the encoder and the decoder for semantic image segmentation.
A system includes a memory, a processor, and an accelerator circuit. The accelerator circuit includes an internal memory, an input circuit block, a filter circuit block, a post-processing circuit block, and an output circuit block to concurrently perform tasks of a neural network application assigned to the accelerator circuit by the processor.
An intelligent light system installed on a motor vehicle includes a light source to provide illumination for the motor vehicle, wherein a wavelength of a light beam generated by the light source is adjustable, a plurality of sensors for capturing sensor data of an environment surrounding the motor vehicle, and a processing device to receive the sensor data captured by the plurality of sensors, provide the sensor data to a neural network to determine a first state of the environment, and issue a control signal to adjust the wavelength of the light beam based on the determined first state of the environment.
A system and an accelerator circuit including a register file comprising instruction registers to store an instruction for evaluating an elementary function, and data registers comprising a first data register to store an input value. The accelerator circuit further includes a successive cumulative rotation circuit comprising a reconfigurable inner stage to perform a successive cumulative rotation recurrence, and a determination circuit to determine a type of the elementary function based on the instruction, and responsive to determining that the input value is a fixed-point number, configure the reconfigurable inner stage to a configuration for evaluating the type of the elementary function, wherein the successive cumulative rotation circuit is to calculate an evaluation of the elementary function using the reconfigurable inner stage performing the successive cumulative rotation recurrence.
G06F 1/06 - Générateurs d'horloge produisant plusieurs signaux d'horloge
G06F 7/499 - Maniement de valeur ou d'exception, p. ex. arrondi ou dépassement
15.
DEVICE AND METHOD FOR HARDWARE-EFFICIENT ADAPTIVE CALCULATION OF FLOATING-POINT TRIGONOMETRIC FUNCTIONS USING COORDINATE ROTATE DIGITAL COMPUTER (CORDIC)
A system and an accelerator circuit including a register file comprising instruction registers to store a trigonometric calculation instruction for evaluating a trigonometric function, and data registers comprising a first data register to store a floating-point input value associated with the trigonometric calculation instruction. The accelerator circuit further includes a determination circuit to identify the trigonometric calculation function and the floating-point input value associated with the trigonometric calculation instruction and determine whether the floating-point input value is in a small value range, and an approximation circuit to responsive to determining that the floating-point input value is in the small value, receive the floating-point input value and calculate an approximation of the trigonometric function with respect to the input value.
A processor includes a register file comprising a length register, a vector register file comprising a plurality of vector registers, a mask register file comprising a plurality of mask registers, and a vector instruction execution circuit to execute a masked vector instruction comprising a first length register identifier representing the length register, a first vector register identifier representing a first vector register of the vector register file, and a first mask register identifier representing a first mask register of the mask register file, wherein the length register is to store a length value representing a number of operations to be applied to data elements stored in the first vector register, the first mask register is to store a plurality of mask bits, and a first mask bit of the plurality of mask bits determines whether a corresponding first one of the operations causes an effect.
An anti-collision system and method of a vehicle including a first sensor device to capture first sensor data associated with a first vehicle in front of the vehicle, a second sensor device to capture second sensor data associated with a second vehicle behind the vehicle, and a processing device to calculate, based on the first sensor data, a plurality of first parameters characterizing the first vehicle, calculate, based on the second sensor data, a plurality of second parameters characterizing the second vehicle, responsive to detecting a braking event by the first vehicle, determine, based on a rule taking into consideration at least one of the plurality of first parameters and at least one of the plurality of second parameters, a braking force for the vehicle, and generate a braking control signal that applies the braking force to brakes of the vehicle.
A washing machine including a rotatable cylinder comprising a washing chamber to hold washables, one or more sensors, and a processing device, communicatively connected to the one or more sensors to control an operation of the washing machine, to receive sensor data captured by the one or more sensors, determine, using a machine learning model based on the sensor data, a plurality of properties associated with the washables, determine a setting for the washing machine based on the plurality of properties, and cause the washing machine to operate according to the setting.
A processor including a vector register file comprising a plurality of vector registers, at least one buffer register, and a vector processing core, communicatively connected to the vector register file and the at least one buffer register, to receive a vector instruction comprising a first identifier representing a first vector register of the plurality of vector registers, and a second identifier representing a second vector register of the plurality of vector registers, wherein the first vector register is a source register and the second vector register is a destination register, execute the vector instruction based on data values stored in the first vector register to generate a result and store the result in the at least one buffer register, and responsive to determining that the second vector register is safe to write, copy the result from the at least one buffer register to the second vector register.
A system and method relating to object detection may include receiving an image frame comprising an array of pixels captured by an image sensor associated with the processing device, identifying a near-field image segment and a far-field image segment in the image frame, applying a first neural network trained for near-field image segments to the near-field image segment for detecting the objects presented in the near-field image segment, and applying a second neural network trained for far-field image segments to the far-field image segment for detecting the objects presented in the near-field image segment.
G06K 9/00 - Méthodes ou dispositions pour la lecture ou la reconnaissance de caractères imprimés ou écrits ou pour la reconnaissance de formes, p.ex. d'empreintes digitales
21.
OBJECT DETECTION USING MULTIPLE SENSORS AND REDUCED COMPLEXITY NEURAL NETWORKS
A system and method relating to object detection using multiple sensor devices include receiving a range data comprising a plurality of points, each of plurality of points being associated with an intensity value and a depth value, determining, based on the intensity values and depth values of the plurality of points, a bounding box surrounding a cluster of points among the plurality of points, receiving a video image comprising an array of pixels, determining a region in the video image corresponding to the bounding box, and applying a first neural network to the region to determine an object captured by the range data and the video image.
A system and method to operate an autonomous vehicle on the road. The system and method may include determining a lane area on a road, calculating a first position within the lane area, determining a tolerance region within the lane area, calculating a deviation offset based on the tolerance region, calculating a second position based on the first position and the deviation offset, and causing to operate the autonomous vehicle to travel to the second position.
A system includes a memory, a processor, and an accelerator circuit. The accelerator circuit includes an internal memory, an input circuit block, a filter circuit block, a post-processing circuit block, and an output circuit block to concurrently perform tasks of a neural network application assigned to the accelerator circuit by the processor.
A processor includes a translation lookaside buffer (TLB) comprising a plurality of ways, wherein each way is associated with a respective page size, and a processing core, communicatively coupled to the TLB, to execute an instruction associated with a virtual memory page, identify a first way of the plurality of ways, wherein the first way is associated with a first page size, determine an index value using the virtual, memory page and the first page size for the first way, determine, using the index value, a first TLB entry of the first way, and translate, using a memory address translation stored in the first TLB entry, the first virtual memory page to a first physical memory page.
A processor includes a translation lookaside buffer (TLB) comprising a plurality of ways, wherein each way is associated with a respective page size, and a processing core, communicatively coupled to the TLB, to execute an instruction associated with a virtual memory page, identify a first way of the plurality of ways, wherein the first way is associated with a first page size, determine an index value using the virtual memory page and the first page size for the first way, determine, using the index value, a first TLB entry of the first way, and translate, using a memory address translation stored in the first TLB entry, the first virtual memory page to a first physical memory page.
G06F 12/1027 - Traduction d'adresses utilisant des moyens de traduction d’adresse associatifs ou pseudo-associatifs, p. ex. un répertoire de pages actives [TLB]
G06F 12/1036 - Traduction d'adresses utilisant des moyens de traduction d’adresse associatifs ou pseudo-associatifs, p. ex. un répertoire de pages actives [TLB] pour espaces adresse virtuels multiples, p. ex. segmentation
G06F 12/1009 - Traduction d'adresses avec tables de pages, p. ex. structures de table de page
G06F 12/0864 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache utilisant des moyens pseudo-associatifs, p. ex. associatifs d’ensemble ou de hachage
26.
IMPLEMENTATION OF REGISTER RENAMING, CALL-RETURN PREDICTION AND PREFETCH
A processor includes a plurality of physical registers and a processor core, communicatively coupled to the plurality of physical registers, the processor core to execute a process comprising a plurality of instructions to responsive to issuance of a call instruction for out-of-order execution, identify, based on a head pointer of the plurality of physical registers, a first physical register of the plurality of physical registers, store a return address in the first physical register, wherein the first physical register is associated with a first identifier, store, based on an out-of-order pointer of a call stack associated with the process, the first identifier in a first entry of the call stack, and increment, modulated by a length of the call stack, the out-of-order pointer of the call stack to point to a second entry of the call stack.
A processor including a first storage to store a first data item, a second storage, and an execution unit comprising a logic circuit encoding an instruction, the instruction comprising a first field to store an identifier of the first storage, a second field to store an identifier of the second storage, and a third field to store an identifier representing a rounding rule, wherein the execution unit is to execute the instruction to generate a second data item based on the first data item, round the second data item according to the rounding rule specified by the instruction, and store the rounded second data item in the second storage.
A processor comprising a cache, the cache comprising a cache line, an execution unit to execute an atomic primitive to responsive to executing a read instruction to retrieve a data item from a memory location, cause to store a copy of the data item in the cache line, execute a lock instruction to lock the cache line to the processor, execute at least one instruction while the cache line is locked to the processor, and execute an unlock instruction to cause the cache controller to release the cache line from the processor.
A processor comprising a cache, the cache comprising a cache line, an execution unit to execute an atomic primitive to responsive to executing a read instruction to retrieve a data item from a memory location, cause to store a copy of the data item in the cache line, execute a lock instruction to lock the cache line to the processor, execute at least one instruction while the cache line is locked to the processor, and execute an unlock instruction to cause the cache controller to release the cache line from the processor.
G06F 12/0817 - Protocoles de cohérence de mémoire cache à l’aide de méthodes de répertoire
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 12/084 - Systèmes de mémoire cache multi-utilisateurs, multiprocesseurs ou multitraitement avec mémoire cache partagée
G06F 12/0895 - Mémoires cache caractérisées par leur organisation ou leur structure de parties de mémoires cache, p. ex. répertoire ou matrice d’étiquettes
G06F 9/52 - Synchronisation de programmesExclusion mutuelle, p. ex. au moyen de sémaphores
A processor including a first storage to store a first data item, a second storage, and an execution unit comprising a logic circuit encoding an instruction, the instruction comprising a first field to store an identifier of the first storage, a second field to store an identifier of the second storage, and a third field to store an identifier representing a rounding rule, wherein the execution unit is to execute the instruction to generate a second data item based on the first data item, round the second data item according to the rounding rule specified by the instruction, and store the rounded second data item in the second storage.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 7/483 - Calculs avec des nombres représentés par une combinaison non linéaire de nombres codés, p. ex. nombres rationnels, système de numération logarithmique ou nombres à virgule flottante
G06F 12/08 - Adressage ou affectationRéadressage dans des systèmes de mémoires hiérarchiques, p. ex. des systèmes de mémoire virtuelle
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
A system and method relate to calculating a first edge map associated with a reference video frame, generating a second edge map associated with an incoming video frame, generating an offset between the reference video frame and the video frame based on a first frequency domain representation of the first edge map and a second frequency domain representation of the second edge map, translating locations of a plurality of pixels of the incoming video frame according to the calculated offset to align the incoming video frame with respect to the reference video frame, and transmitting the aligned video frame to a downstream device.
A system and method relate to calculating a first edge map associated with a reference video frame, generating a second edge map associated with an incoming video frame, generating an offset between the reference video frame and the video frame based on a first frequency domain representation of the first edge map and a second frequency domain representation of the second edge map, translating locations of a plurality of pixels of the incoming video frame according to the calculated offset to align the incoming video frame with respect to the reference video frame, and transmitting the aligned video frame to a downstream device.
A monolithic dual band antenna is provided. The monolithic dual band antenna includes a first layer comprising a high frequency band antenna. The monolithic dual band antenna further includes a second layer underlying the first layer. The second layer includes a low frequency band antenna. The geometry of the high frequency antenna relative to the low frequency antenna causes resulting electric fields of the high frequency band antenna to be orthogonal to the resulting electric fields of the low frequency band antenna.
A computer processor may include a plurality of hardware threads. The computer processor may further include state processor logic for a state of a hardware thread. The state processor logic may include per thread logic that contains state that is replicated in each hardware thread of the plurality of hardware threads and common logic that is independent of each hardware thread of the plurality of hardware threads. The computer processor may further include single threaded mode logic to execute instructions in a single threaded mode from only one hardware thread of the plurality of hardware threads. The computer processor may further include second mode logic to execute instructions in a second mode from more than one hardware thread of the plurality of hardware threads simultaneously. The computer processor may further include switching mode logic to switch between the first mode and the second mode.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 9/46 - Dispositions pour la multiprogrammation
G06F 9/48 - Lancement de programmes Commutation de programmes, p. ex. par interruption
G06F 9/455 - ÉmulationInterprétationSimulation de logiciel, p. ex. virtualisation ou émulation des moteurs d’exécution d’applications ou de systèmes d’exploitation
A computer processor may include a plurality of hardware threads. The computer processor may further include state processor logic for a state of a hardware thread. The state processor logic may include per thread logic that contains state that is replicated in each hardware thread of the plurality of hardware threads and common logic that is independent of each hardware thread of the plurality of hardware threads. The computer processor may further include single threaded mode logic to execute instructions in a single threaded mode from only one hardware thread of the plurality of hardware threads. The computer processor may further include second mode logic to execute instructions in a second mode from more than one hardware thread of the plurality of hardware threads simultaneously. The computer processor may further include witching mode logic to switch between the first mode and the second mode.
A computer processor with an address register file is disclosed. The computer processor may include a memory. The computer processor may further include a general purpose register file comprising at least one general purpose register. The computer processor may further include an address register file comprising at least one address register. The computer processor may further include having access to the memory, the general purpose register file, and the address register file. The processing logic may execute a memory access instruction that accesses one or more memory locations in the memory at one or more corresponding addresses computed by retrieving the value of an address register of the at least one register of the address register file specified in the instruction and adding a displacement value encoded in the instruction.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 12/0875 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec mémoire cache dédiée, p. ex. instruction ou pile
G06F 12/0893 - Mémoires cache caractérisées par leur organisation ou leur structure
G06F 12/1009 - Traduction d'adresses avec tables de pages, p. ex. structures de table de page
G06F 9/32 - Formation de l'adresse de l'instruction suivante, p. ex. par incrémentation du compteur ordinal
G06F 12/0862 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec pré-lecture
G06F 12/1027 - Traduction d'adresses utilisant des moyens de traduction d’adresse associatifs ou pseudo-associatifs, p. ex. un répertoire de pages actives [TLB]
G06F 3/06 - Entrée numérique à partir de, ou sortie numérique vers des supports d'enregistrement
37.
Computer processor that implements pre-translation of virtual addresses
A computer processor that implements pre-translation of virtual addresses is disclosed. The computer processor may include a register file comprising one or more registers. The computer processor may include processing logic. The processing logic may receive a value to store in a register of one or more registers. The processing logic may store the value in the register. The processing logic may designate the received value as a virtual address, the virtual address having a corresponding virtual base page number. The processing logic may translate the virtual base page number to a corresponding real base page number and zero or more real page numbers corresponding to zero or more virtual page numbers adjacent to the virtual base page number. The processing logic may further store in the register of the one or more registers the real base page number and the zero or more real page numbers.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 3/06 - Entrée numérique à partir de, ou sortie numérique vers des supports d'enregistrement
G06F 12/0875 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec mémoire cache dédiée, p. ex. instruction ou pile
G06F 12/0893 - Mémoires cache caractérisées par leur organisation ou leur structure
G06F 12/1009 - Traduction d'adresses avec tables de pages, p. ex. structures de table de page
G06F 12/0862 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec pré-lecture
G06F 9/32 - Formation de l'adresse de l'instruction suivante, p. ex. par incrémentation du compteur ordinal
A computer processor with register direct branches and employing an instruction preload structure is disclosed. The computer processor may include a hierarchy of memories comprising a first memory organized in a structure having one or more entries for one or more addresses corresponding to one or more instructions. The one or more entries of the one or more addresses may have a starting address. The structure may have one or more locations for storing the one or more instructions. The computer processor may include one or more registers to which one or more corresponding instruction addresses are writable. The computer processor may include processing logic. In response to the processing logic writing the one or more instruction addresses to the one or more registers, the processing logic may to pre-fetch the one or more instructions of a linear sequence of instructions from a first memory level of the hierarchy of memories into a second memory level of the hierarchy of memories beginning at the starting address. At least one address of the one or more addresses may be the contents of a register of the one or more registers.
G06F 12/08 - Adressage ou affectationRéadressage dans des systèmes de mémoires hiérarchiques, p. ex. des systèmes de mémoire virtuelle
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 3/06 - Entrée numérique à partir de, ou sortie numérique vers des supports d'enregistrement
G06F 12/0875 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec mémoire cache dédiée, p. ex. instruction ou pile
G06F 12/0893 - Mémoires cache caractérisées par leur organisation ou leur structure
G06F 12/1009 - Traduction d'adresses avec tables de pages, p. ex. structures de table de page
G06F 12/0862 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec pré-lecture
G06F 9/32 - Formation de l'adresse de l'instruction suivante, p. ex. par incrémentation du compteur ordinal
A computer processor that implements pre-translation of virtual addresses with target registers is disclosed. The computer processor may include a register file comprising one or more registers. The computer processor may include processing logic. The processing logic may receive a value to store in a register of one or more registers. The processing logic may store the value in the register. The processing logic may designate the received value as a virtual instruction address, the virtual instruction address having a corresponding virtual base page number. The processing logic may translate the virtual base page number to a corresponding real base page number and zero or more real page numbers corresponding to zero or more virtual page numbers adjacent to the virtual base page number. The processing logic may further store in the register of the one or more registers the real base page number and the zero or more real page numbers.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 3/06 - Entrée numérique à partir de, ou sortie numérique vers des supports d'enregistrement
G06F 12/0875 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec mémoire cache dédiée, p. ex. instruction ou pile
G06F 12/0893 - Mémoires cache caractérisées par leur organisation ou leur structure
G06F 12/1009 - Traduction d'adresses avec tables de pages, p. ex. structures de table de page
G06F 12/0862 - Adressage d’un niveau de mémoire dans lequel l’accès aux données ou aux blocs de données désirés nécessite des moyens d’adressage associatif, p. ex. mémoires cache avec pré-lecture
G06F 9/32 - Formation de l'adresse de l'instruction suivante, p. ex. par incrémentation du compteur ordinal
A computer processor with an address register file is disclosed. The computer processor may include a memory. The computer processor may further include a general purpose register file comprising at least one general purpose register. The computer processor may further include an address register file comprising at least one address register. The computer processor may further include having access to the memory, the general purpose register file, and the address register file. The processing logic may execute a memory access instruction that accesses one or more memory locations in the memory at one or more corresponding addresses computed by retrieving the value of an address register of the at least one register of the address register file specified in the instruction and adding a displacement value encoded in the instruction.
A computer processor is disclosed. The computer processor may comprises a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more instructions that produce results with elements of widths different than that of the input elements. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more instructions that separate a vector or combine two vectors. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising one or more registers to hold a varying number of elements. The computer processor may further comprise processing logic configured to implicitly type each of the varying number of elements in the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor comprises one or more processor resources. The computer processor further comprises a plurality of hardware thread units coupled to the one or more processor resources. The computer processor may be configured to permit simultaneous access to the one or more processor resources by only a subset of hardware thread units of the plurality of hardware thread units. The number of hardware threads in the subset may be less than the total number of hardware threads of the plurality of hardware thread units.
A computer processor is disclosed. The computer processor comprises a vector unit comprising a vector register file comprising at least one vector register to hold a varying number of elements. The number of architected vector registers in the vector register file differs from the number of physical vector registers in the vector register file.
A computer processor comprising a vector unit is disclosed. The vector unit may comprise a vector register file comprising at least one register to hold a varying number of elements. The vector unit may further comprise a vector length register file comprising at least one register to specify the number of operations of a vector instruction to be performed on the varying number of elements in the at least one register of the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor comprises a vector unit comprising a vector register file comprising at least one vector register to hold a varying number of elements. The computer processor further comprises out-of-order issue logic that holds a pool of vector instructions, selects a vector instruction from the pool, and sends the vector instruction for execution. The vector instruction operates on the varying number of elements of the at least one vector register.
A computer processor is disclosed. The computer processor comprises a vector unit comprising a vector register file comprising one or more registers to hold a varying number of elements. The computer processor further comprises processing logic configured to operate on the varying number of elements in the vector register file using one or more digital signal processing instructions. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more graphics processing instructions. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more complex arithmetic instructions. The computer processor may be implemented as a monolithic integrated circuit.
A computer processor is disclosed. The computer processor comprises one or more processor resources. The computer processor further comprises a plurality of hardware thread units coupled to the one or more processor resources. The computer processor may be configured to permit simultaneous access to the one or more processor resources by only a subset of hardware thread units of the plurality of hardware thread units. The number of hardware threads in the subset may be less than the total number of hardware threads of the plurality of hardware thread units.
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 9/46 - Dispositions pour la multiprogrammation
G06F 15/76 - Architectures de calculateurs universels à programmes enregistrés
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
52.
Vector processor to operate on variable length vectors using graphics processing instructions
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more graphics processing instructions. The computer processor may be implemented as a monolithic integrated circuit.
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
53.
Vector processor to operate on variable length vectors with out-of-order execution
A computer processor is disclosed. The computer processor comprise a vector unit comprising a vector register file comprising at least one vector register to hold a varying number of elements. The computer processor further comprises out-of-order issue logic that holds a pool of vector instructions, selects a vector instruction from the pool, and sends the vector instruction for execution. The vector instruction operates on the varying number of elements of the at least one vector register.
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
54.
Vector processor configured to operate on variable length vectors using one or more complex arithmetic instructions
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more complex arithmetic instructions. The computer processor may be implemented as a monolithic integrated circuit.
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
G06F 9/46 - Dispositions pour la multiprogrammation
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
55.
Vector processor configured to operate on variable length vectors using instructions to combine and split vectors
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more instructions that separate a vector or combine two vectors. The computer processor may be implemented as a monolithic integrated circuit.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
56.
Vector processor configured to operate on variable length vectors using implicitly typed instructions
A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising one or more registers to hold a varying number of elements. The computer processor may further comprise processing logic configured to implicitly type each of the varying number of elements in the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
57.
Monolithic vector processor configured to operate on variable length vectors using a vector length register
A computer processor comprising a vector unit is disclosed. The vector unit may comprise a vector register file comprising at least one register to hold a varying number of elements. The vector unit may further comprise a vector length register file comprising at least one register to specify the number of operations of a vector instruction to be performed on the varying number of elements in the at least one register of the vector register file. The computer processor may be implemented as a monolithic integrated circuit.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
G06F 9/46 - Dispositions pour la multiprogrammation
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
58.
Vector processor configured to operate on variable length vectors using digital signal processing instructions
A computer processor is disclosed. The computer processor comprises a vector unit comprising a vector register file comprising one or more registers to hold a varying number of elements. The computer processor further comprises processing logic configured to operate on the varying number of elements in the vector register file using one or more digital signal processing instructions. The computer processor may be implemented as a monolithic integrated circuit.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
59.
Vector processor configured to operate on variable length vectors using instructions that change element widths
A computer processor is disclosed. The computer processor may comprises a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more instructions that produce results with elements of widths different than that of the input elements. The computer processor may be implemented as a monolithic integrated circuit.
G06F 9/30 - Dispositions pour exécuter des instructions machines, p. ex. décodage d'instructions
G06F 9/38 - Exécution simultanée d'instructions, p. ex. pipeline ou lecture en mémoire
G06F 15/78 - Architectures de calculateurs universels à programmes enregistrés comprenant une seule unité centrale
G06F 15/80 - Architectures de calculateurs universels à programmes enregistrés comprenant un ensemble d'unités de traitement à commande commune, p. ex. plusieurs processeurs de données à instruction unique
G06F 17/14 - Transformations de Fourier, de Walsh ou transformations d'espace analogues
G06F 9/46 - Dispositions pour la multiprogrammation
60.
OPPORTUNITY MULTITHREADING IN A MULTITHREADED PROCESSOR WITH INSTRUCTION CHAINING CAPABILITY
A computing device determines that a current software thread of a plurality of software threads having an issuing sequence does not have a first instruction waiting to be issued to a hardware thread during a clock cycle. The computing device identifies one or more alternative software threads in the issuing sequence having instructions waiting to be issued. The computing device selects, during the clock cycle by the computing device, a second instruction from a second software thread among the one or more alternative software threads in view of determining that the second instruction has no dependencies with any other instructions among the instructions waiting to be issued. Dependencies are identified by the computing device in view of the values of a chaining bit extracted from each of the instructions waiting to be issued. The computing device issues the second instruction to the hardware thread.
A chaining bit decoder of a computer processor receives an instruction stream. The chaining bit decoder selects a group of instructions from the instruction stream. The chaining bit decoder extracts a designated bit from each instruction of the instruction stream to produce a sequence of chaining bits. The chaining bit decoder decodes the sequence of chaining bits. The chaining bit decoder identifies zero or more instruction stream dependencies among the selected group of instructions in view of the decoded sequence of chaining bits. The chaining bit decoder outputs control signals to cause one or more pipelines stages of the processor to execute the selected group of instructions in view of the identified zero or more instruction stream dependencies among the group sequence of instructions.
A processing device identifies a set of software threads having instructions waiting to issue. For each software thread in the set of the software threads, the processing device binds the software thread to an available hardware context in a set of hardware contexts and stores an identifier of the available hardware context bound to the software thread to a next available entry in an ordered list. The processing device reads an identifier stored in an entry of the ordered list. Responsive to an instruction associated with the identifier having no dependencies with any other instructions among the instructions waiting to issue, the processing device issues the instruction waiting to issue to the hardware context associated with the identifier.
A computing device determines that a current software thread of a plurality of software threads having an issuing sequence does not have a first instruction waiting to be issued to a hardware thread during a clock cycle. The computing device identifies one or more alternative software threads in the issuing sequence having instructions waiting to be issued. The computing device selects, during the clock cycle by the computing device, a second instruction from a second software thread among the one or more alternative software threads in view of determining that the second instruction has no dependencies with any other instructions among the instructions waiting to be issued. Dependencies are identified by the computing device in view of the values of a chaining bit extracted from each of the instructions waiting to be issued. The computing device issues the second instruction to the hardware thread.
A chaining bit decoder of a computer processor receives an instruction stream. The chaining bit decoder selects a group of instructions from the instruction stream. The chaining bit decoder extracts a designated bit from each instruction of the instruction stream to produce a sequence of chaining bits. The chaining bit decoder decodes the sequence of chaining bits. The chaining bit decoder identifies zero or more instruction stream dependencies among the selected group of instructions in view of the decoded sequence of chaining bits. The chaining bit decoder outputs control signals to cause one or more pipelines stages of the processor to execute the selected group of instructions in view of the identified zero or more instruction stream dependencies among the group sequence of instructions.
A processing device identifies a set of software threads having instructions waiting to issue. For each software thread in the set of the software threads, the processing device binds the software thread to an available hardware context in a set of hardware contexts and stores an identifier of the available hardware context bound to the software thread to a next available entry in an ordered list. The processing device reads an identifier stored in an entry of the ordered list. Responsive to an instruction associated with the identifier having no dependencies with any other instructions among the instructions waiting to issue, the processing device issues the instruction waiting to issue to the hardware context associated with the identifier.