A method of operating a personal intelligent agent in an ambient computing environment, comprising receiving input; analyzing input to derive a user personal preference; associating the personal preference with a first context indicator; determining whether the personal preference is exposable; responsive to determining that the personal preference is exposable, storing the preference with the associated context indicator; detecting when the agent enters a detectable context and responsively creating a second context indicator; determining if there is a match between the second and the first context indicator; retrieving the exposable personal preference associated with the context indicator; creating an anonymous preference indicator comprising the exposable personal preference with the matched context; emitting the preference indicator over the ambient computing environment; and monitoring the ambient computing environment to detect any broadcast message indicating ability to satisfy the preference shown in the preference indicator.
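The flow of the method above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the claimed implementation: the `PersonalAgent` class, the `offers_for` helper, the `satisfies` message field, and the example preferences and contexts are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalAgent:
    # Maps a stored (first) context indicator to its exposable preferences.
    store: dict = field(default_factory=dict)

    def learn(self, preference: str, context: str, exposable: bool) -> None:
        # Only preferences judged exposable are stored with their context.
        if exposable:
            self.store.setdefault(context, []).append(preference)

    def enter_context(self, detected_context: str) -> list:
        # The detected (second) context indicator is matched against stored ones.
        # Each emitted indicator carries only preference and context, no identity.
        matches = self.store.get(detected_context, [])
        return [{"preference": p, "context": detected_context} for p in matches]


def offers_for(indicator: dict, broadcasts: list) -> list:
    # Monitor broadcast messages for any that claim to satisfy the preference.
    return [b for b in broadcasts if b.get("satisfies") == indicator["preference"]]


agent = PersonalAgent()
agent.learn("oat-milk latte", "cafe", exposable=True)
agent.learn("prescription refill", "pharmacy", exposable=False)
emitted = agent.enter_context("cafe")
broadcasts = [{"from": "kiosk-7", "satisfies": "oat-milk latte"},
              {"from": "kiosk-9", "satisfies": "espresso"}]
offers = offers_for(emitted[0], broadcasts)
```

In this sketch the non-exposable preference is never stored, so entering the "pharmacy" context emits nothing.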
The present disclosure relates to a data processor for processing data, comprising: a plurality of execution units to execute one or more operations; and a plurality of storage elements to store data for the one or more operations, the data processor being configured to process at least one task, each task to be executed in the form of a directed acyclic graph of operations, wherein each of the operations maps to a corresponding execution unit and each connection between operations in the acyclic graph maps to a corresponding storage element, the data processor further comprising: a plurality of counters; and a control module to control the plurality of counters to: in a first mode, count an operation cycle number associated with each operation of the at least one task, the operation cycle number of an operation being a number of cycles required to complete the operation; and in a second mode, count a unit cycle number associated with one or more execution units, the unit cycle number of an execution unit being an accumulative number of cycles when the execution unit is occupied in use during execution of the at least one task.
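The two counting modes can be illustrated with a small software model. This is a sketch only, not the disclosed hardware: the `Op` record, the `count_cycles` function, the mode names, and the example cycle counts are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str    # operation (node) in the task graph
    unit: str    # execution unit the operation maps to
    cycles: int  # cycles required to complete the operation


def count_cycles(ops, mode):
    """Return counters keyed per operation (first mode) or per unit (second mode)."""
    if mode not in ("operation", "unit"):
        raise ValueError("mode must be 'operation' or 'unit'")
    counters = defaultdict(int)
    for op in ops:
        key = op.name if mode == "operation" else op.unit
        counters[key] += op.cycles  # per-unit counts accumulate across operations
    return dict(counters)


task = [Op("load", "dma", 4), Op("conv", "mac", 10), Op("relu", "mac", 2)]
per_op = count_cycles(task, "operation")
per_unit = count_cycles(task, "unit")
```

The same counters serve both views: the first mode shows which operation is slow, the second shows which execution unit is busiest.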
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; semiconductors; system-on-chip devices; microprocessors; processors [central processing units]; microprocessors in the field of artificial intelligence; neural network processors; electronic chips; application-specific integrated circuits; graphics processing units; semiconductor intellectual property cores; computer interfaces, namely instruction set architectures; printed circuit boards; computer software for integrated circuits; cloud computing software for semiconductors; downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors [central processing units], chips [integrated circuits], application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; semiconductors; system-on-chip devices; microprocessors; processors [central processing units]; microprocessors in the field of artificial intelligence; neural network processors; electronic chips; application-specific integrated circuits; graphics processing units; semiconductor intellectual property cores; computer interfaces, namely instruction set architectures; printed circuit boards; semiconductor devices featuring technology for automotive applications; computer software for integrated circuits; downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors [central processing units], chips [integrated circuits], application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; semiconductor design for automotive technology; research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; Semiconductors; System-on-chip devices; Microprocessors; Processors central processing units; Microprocessors in the field of artificial intelligence; Neural network processors; Electronic chips; Application-specific integrated circuits; Graphics processing units; Semiconductor intellectual property cores; Computer interfaces, namely instruction set architectures; Printed circuit boards; Semiconductors for use in computers and laptop devices; Computer software for integrated circuits; Downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; Electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors central processing units, chips, integrated circuits, application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; Research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; Research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; Semiconductors; System-on-chip devices; Microprocessors; Processors central processing units; Microprocessors in the field of artificial intelligence; Neural network processors; Electronic chips; Application-specific integrated circuits; Graphics processing units; Semiconductor intellectual property cores; Computer interfaces, namely instruction set architectures; printed circuit boards; Semiconductors, microprocessors, and microprocessors for Internet of Things (IOT) devices; Computer software for integrated circuits; Downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; Electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors central processing units, chips, integrated circuits, application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; Research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; Research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; Semiconductors; System-on-chip devices; Microprocessors; Processors central processing units; Microprocessors in the field of artificial intelligence; Neural network processors; Electronic chips; Application-specific integrated circuits; Graphics processing units; Semiconductor intellectual property cores; Computer interfaces, namely instruction set architectures; Printed circuit boards; Semiconductors for use in automotive applications; Computer software for integrated circuits; Downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; Electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors central processing units, chips integrated circuits, application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; Design of semiconductor chips and components for automotive applications; Research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; Semiconductors; System-on-chip devices; Microprocessors; Processors central processing units; Microprocessors in the field of artificial intelligence; Neural network processors; Electronic chips; Application-specific integrated circuits; Graphics processing units; Semiconductor intellectual property cores; Computer interfaces, namely instruction set architectures; Printed circuit boards; Computer software for integrated circuits; Semiconductors for use in handheld and mobile devices; Downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; Electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors central processing units, chips integrated circuits, application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; Research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; Research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
DEVICE PERMISSIONS TABLE DEFINING PERMISSIONS INFORMATION FOR A TRANSLATED ACCESS REQUEST
Apparatus, method and code for fabrication of an apparatus. The apparatus comprises address translation circuitry (116) to translate virtual addresses to physical addresses in response to advance address translation requests issued by devices (105) on behalf of software contexts (125). The apparatus also comprises translated access control circuitry (117) to control access to memory (110) in response to translated access requests issued by the devices (105) on behalf of the software contexts (125), based on permissions information defined in a device permission table (220), wherein the corresponding access permissions provide information for checking whether translated access requests from a plurality of software contexts are prohibited.
A spiking neural network is described that comprises a plurality of neurons in a first layer connected to at least one neuron in a second layer, each neuron in the first layer being connected to the at least one neuron in the second layer via a respective variable delay path. The at least one neuron in the second layer comprises one or more logic components configured to generate an output signal in dependence upon signals received along the variable delay paths from the plurality of neurons in the first layer. A timing component is configured to determine a timing value in response to receiving the output signal from the one or more logic components, and an accumulate component is configured to accumulate a value based on timing values from the timing component. A neuron fires in a case that a value accumulated at the accumulate component reaches a threshold value.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; semiconductors; system-on-chip devices; microprocessors; processors [central processing units]; microprocessors in the field of artificial intelligence; neural network processors; electronic chips; application-specific integrated circuits; graphics processing units; semiconductor intellectual property cores; computer interfaces, namely instruction set architectures; printed circuit boards; semiconductors for computers and laptop devices; computer software for integrated circuits; downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors [central processing units], chips [integrated circuits], application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; semiconductors; system-on-chip devices; microprocessors; processors [central processing units]; microprocessors in the field of artificial intelligence; neural network processors; electronic chips; application-specific integrated circuits; graphics processing units; semiconductor intellectual property cores; computer interfaces, namely instruction set architectures; printed circuit boards; computer software for integrated circuits; semiconductors for handheld and mobile devices; downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors [central processing units], chips [integrated circuits], application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; semiconductors; system-on-chip devices; microprocessors; processors [central processing units]; microprocessors in the field of artificial intelligence; neural network processors; electronic chips; application-specific integrated circuits; graphics processing units; semiconductor intellectual property cores; computer interfaces, namely instruction set architectures; printed circuit boards; semiconductors, microprocessors for Internet of Things (IOT) devices; computer software for integrated circuits; downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors [central processing units], chips [integrated circuits], application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Integrated circuits; Semiconductors; System-on-chip devices; Microprocessors; Processors central processing units; Microprocessors in the field of artificial intelligence; Neural network processors; Electronic chips; Application-specific integrated circuits; Graphics processing units; Semiconductor intellectual property cores; Computer interfaces, namely instruction set architectures; Printed circuit boards; Downloadable computer software for integrated circuits; Downloadable cloud computing software for semiconductors; Downloadable computer operating software and libraries for machine learning, deep learning, and artificial intelligence hardware platforms in the field of semiconductors; Electronic downloadable materials, namely, electronic downloadable instruction and development manuals, datasheets and brochures, all in the area of design and development of integrated circuits, microprocessors, microprocessor cores, macro cells, microcontrollers, bus interfaces, and printed circuit boards. Design of semiconductors, microprocessors, system-on-chip devices, processors central processing units, chips, integrated circuits, application-specific integrated circuits, graphics processing units, machine learning processors and semiconductor cores; Research, development, and design relating to computer hardware for semiconductor intellectual property, instruction set architectures, microprocessors; Research, development and design, all relating to computer software used in, and for use in the design, verification and construction of microprocessors, processors, microcontrollers, microprocessor design files, semiconductor intellectual property cores, computer hardware accelerators, neural network processors and machine learning processors.
A graphics processing system that is operable to perform ray tracing using micromaps is disclosed. A tree representation of a micromap is generated, and when it is desired to determine whether and/or how a ray interacts with a sub-region of a primitive, the tree representation of the micromap is traversed to determine a property value for the sub-region of the primitive.
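To make the idea of traversing a tree representation concrete, the sketch below builds a quadtree over a small square grid of per-sub-region values and walks it to look one value up. The abstract does not specify the tree type or the sub-region layout, so the quadtree, the square grid, the `build` and `lookup` names, and the "opaque"/"transparent" values are all assumptions made for illustration.

```python
def build(grid, x=0, y=0, size=None):
    """Build a quadtree over a 2^n x 2^n grid; uniform blocks collapse to one leaf."""
    size = len(grid) if size is None else size
    first = grid[y][x]
    if all(grid[y + j][x + i] == first for j in range(size) for i in range(size)):
        return first  # leaf: every sub-region in this block shares one value
    h = size // 2
    # Children in order: top-left, top-right, bottom-left, bottom-right.
    return [build(grid, x, y, h), build(grid, x + h, y, h),
            build(grid, x, y + h, h), build(grid, x + h, y + h, h)]


def lookup(node, x, y, size):
    """Traverse the tree to the leaf covering sub-region (x, y)."""
    while isinstance(node, list):
        size //= 2
        index = (1 if x >= size else 0) + (2 if y >= size else 0)
        x, y = x % size, y % size
        node = node[index]
    return node


grid = [
    ["opaque", "opaque", "transparent", "transparent"],
    ["opaque", "opaque", "transparent", "transparent"],
    ["opaque", "opaque", "opaque",      "transparent"],
    ["opaque", "opaque", "transparent", "transparent"],
]
tree = build(grid)
```

Three of the four quadrants collapse to a single leaf here, so most lookups finish after one step.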
An apparatus is described having processing circuitry to perform vector processing operations, a set of vector registers, and an instruction decoder to decode vector instructions to control the processing circuitry to perform the required operations. The instruction decoder is responsive to a given vector memory access instruction specifying a plurality of memory access operations, where each memory access operation is to be performed to access an associated data element, to determine, from a data vector indication field of the given vector memory access instruction, at least one vector register in the set of vector registers associated with a plurality of data elements, and to determine, from at least one capability vector indication field of the given vector memory access instruction, a plurality of vector registers in the set of vector registers containing a plurality of capabilities. Each capability is associated with one of the data elements in the plurality of data elements and provides an address indication and constraining information constraining use of that address indication when accessing memory. The number of vector registers determined from the at least one capability vector indication field is greater than the number of vector registers determined from the data vector indication field. The instruction decoder controls the processing circuitry: to determine, for each given data element in the plurality of data elements, a memory address based on the address indication provided by the associated capability, and to determine whether the memory access operation to be used to access the given data element is allowed in respect of that determined memory address having regard to the constraining information of the associated capability; and to enable performance of the memory access operation for each data element for which the memory access operation is allowed.
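The per-element check described above can be modelled in software. This is a simplified sketch, not the disclosed instruction: the `Capability` fields (a single address, a base/limit range, and a write flag) and the `vector_store` function are hypothetical, and real capability formats carry more constraining information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    address: int    # address indication used for the access
    base: int       # lowest address the capability may access
    limit: int      # one past the highest permitted address
    writable: bool  # whether stores are permitted


def vector_store(memory, data, capabilities):
    """Store each data element through its own capability.

    Returns a per-element list of booleans: True where the store was allowed
    and performed, False where the capability prohibited it.
    """
    performed = []
    for value, cap in zip(data, capabilities):
        allowed = cap.writable and cap.base <= cap.address < cap.limit
        if allowed:
            memory[cap.address] = value
        performed.append(allowed)
    return performed


memory = {}
caps = [Capability(0x10, 0x10, 0x20, True),   # in bounds, writable
        Capability(0x30, 0x10, 0x20, True),   # address outside bounds
        Capability(0x18, 0x10, 0x20, False)]  # in bounds, read-only
mask = vector_store(memory, [7, 8, 9], caps)
```

Only the first element is written; the other two accesses are suppressed individually rather than failing the whole vector operation, which is one possible reading of "enable performance ... for each data element for which the memory access operation is allowed".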
An apparatus has processing circuitry (16) to perform data processing, and instruction decoding circuitry (10) to control the processing circuitry to perform the data processing in response to decoding of program instructions defined according to a scalable vector instruction set architecture supporting vector instructions operating on vectors of scalable vector length to enable the same instruction sequence to be executed on apparatuses with hardware supporting different maximum vector lengths. The instruction decoding circuitry and the processing circuitry support a sub-vector-supporting instruction which treats a given vector as comprising a plurality of sub-vectors with each sub-vector comprising a plurality of vector elements. In response to the sub-vector-supporting instruction, the instruction decoding circuitry controls the processing circuitry to perform an operation for the given vector at sub-vector granularity. Each sub-vector has an equal sub-vector length.
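Operating at sub-vector granularity can be shown with a short model in which the same routine runs unchanged on vectors of different lengths. The `per_subvector` helper and the choice of element reversal as the operation are assumptions for illustration; the abstract does not name a specific operation.

```python
def per_subvector(vector, subvector_len, op):
    """Apply op independently to each equal-length sub-vector of vector."""
    if len(vector) % subvector_len != 0:
        raise ValueError("vector length must be a multiple of the sub-vector length")
    result = []
    for start in range(0, len(vector), subvector_len):
        result.extend(op(vector[start:start + subvector_len]))
    return result


def reverse(sub):
    # Example operation: reverse the elements within one sub-vector.
    return sub[::-1]


short_hw = per_subvector([1, 2, 3, 4], 4, reverse)               # one sub-vector
long_hw = per_subvector([1, 2, 3, 4, 5, 6, 7, 8], 4, reverse)    # two sub-vectors
```

The call is identical in both cases; only the number of sub-vectors processed changes with the vector length.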
One or more lighting components are projected onto pixel locations of a rendered image with sampling locations set off from pixel location centers according to associated jitter vectors. The sampled image is denoised in a way that preserves the associated jitter vectors, and the denoising may be performed separately for different lighting components. The denoised image is processed using upsampling and/or temporal antialiasing, using the associated jitter vectors, to an image format having a spatial resolution at least as high as the denoised image.
Various implementations described herein are directed to a device having a write circuit that provides data for storage. The device may include a memory circuit that stores the data in leaky bitcells with capacitive elements that gradually discharge over a pre-determined period of time. The device may include a read circuit that enables the leaky bitcells to operate as one or more memory storage elements. The device may include a query circuit that identifies matches between a query data and output data provided by the read circuit.
In a data processing system, a command stream provided to a processing resource to cause the processing resource to perform a processing task for an application executing on a host processor comprises a sequence of commands for execution by the processing resource to cause the processing resource to perform the processing operations for the processing task and one or more data save indicators that indicate data that is to be saved. In response to the processing resource receiving a request to suspend processing of the processing task, data indicated by one of the one or more data save indicators in the command stream is stored in memory.
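A toy interpreter shows how data save indicators embedded in a command stream might be used on suspension. The stream encoding (`"cmd"` and `"save"` tuples), the register model, the `run_until_suspend` function, and the choice to honour the most recently seen indicator are all assumptions of this sketch.

```python
def run_until_suspend(stream, suspend_after):
    """Process stream entries until a suspend request arrives.

    stream entries are ("cmd", (register, value)) or ("save", (register, ...)).
    suspend_after is the number of entries processed before the request.
    Returns (registers, saved), where saved is None if no suspend occurred.
    """
    registers = {}
    save_set = ()  # registers named by the latest data save indicator
    for processed, (kind, payload) in enumerate(stream):
        if processed == suspend_after:
            # Store only the data the indicator marked as needing to be saved.
            saved = {r: registers[r] for r in save_set if r in registers}
            return registers, saved
        if kind == "save":
            save_set = payload
        else:
            register, value = payload
            registers[register] = value
    return registers, None


stream = [("cmd", ("a", 1)), ("save", ("a",)), ("cmd", ("b", 2)),
          ("cmd", ("a", 3)), ("save", ("a", "b")), ("cmd", ("c", 4))]
registers, saved = run_until_suspend(stream, suspend_after=4)
```

At the point of suspension both `a` and `b` hold live values, but only `a` is written out because the indicator in force names only `a`.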
The present techniques relate to voltage droop detection and there is disclosed circuitry for detecting a voltage droop event, the circuitry configured to: receive a clock signal from a clock distribution network; obtain, from a storage, a first predetermined value, a second predetermined value and a predetermined threshold count; obtain one or more measurement values associated with a system voltage; when a first measurement value of the one or more measurement values reaches the first predetermined value, initiate a count of clock cycles until a subsequent measurement value of the one or more measurement values reaches the second predetermined value, the second predetermined value being different from the first predetermined value; and when the count of clock cycles is lower than the predetermined threshold count, cause a control entity to take mitigation action.
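The counting scheme above amounts to measuring how quickly the measurement moves from the first value to the second. A minimal software sketch follows, assuming one measurement per clock cycle, a falling voltage (so "reaches" means "is at or below"), and integer millivolt readings; the function name `detect_droop` and the sample data are hypothetical.

```python
def detect_droop(measurements, first_value, second_value, threshold_count):
    """Return True if mitigation should be requested.

    Counting starts when a measurement reaches first_value and stops when a
    later (or the same) measurement reaches second_value. A count below
    threshold_count means the voltage fell quickly, which is treated as droop.
    """
    count = None  # None means counting has not started
    for m in measurements:
        if count is None:
            if m > first_value:
                continue
            count = 0      # first predetermined value reached
        else:
            count += 1     # one more clock cycle elapsed
        if m <= second_value:
            return count < threshold_count
    return False


fast = [1000, 950, 890]                      # falls 60 mV in one cycle
slow = [1000, 950, 940, 930, 920, 900]       # same fall over four cycles
fast_droop = detect_droop(fast, 950, 900, threshold_count=3)
slow_droop = detect_droop(slow, 950, 900, threshold_count=3)
```

Using a cycle count rather than a single threshold distinguishes a sharp droop from a slow, benign drift through the same voltage range.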
G01R 19/165 - Indicating that current or voltage is either above or below a predetermined value or within or outside a predetermined range of values
G06F 1/30 - Means for acting in the event of power-supply failure or interruption, e.g. power-supply fluctuations
H03K 5/00 - Manipulation of pulses not covered by one of the other main groups of this subclass
H03K 5/14 - Arrangements having a single output and transforming input signals into pulses delivered at desired time intervals by the use of delay lines
The present techniques relate to mitigating droop conditions over state transitions in systems having dynamic voltage and frequency scaling and there is disclosed a method of controlling a dynamic voltage and frequency scaling circuit, comprising: initiating a transition from a first voltage and frequency state to a second voltage and frequency state; switching activity from a first nominal source to a first fallback source; retuning the first nominal source to become a second fallback source at the second voltage and frequency state; switching activity from the first fallback source to the second fallback source; retuning the first fallback source to become a second nominal source at the second voltage and frequency state; and switching activity from the second fallback source to the second nominal source.
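The six steps of the method can be traced with two source objects that swap roles during the transition. The `Source` class, the `transition` function, and the state names are hypothetical; the sketch only records which source is active after each switch.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    role: str   # "nominal" or "fallback"
    state: str  # voltage and frequency state the source is tuned for


def transition(nominal, fallback, target_state):
    """Move activity to target_state; return (active source, trace of active names)."""
    trace = []
    active = fallback                       # switch: first nominal -> first fallback
    trace.append(active.name)
    nominal.role, nominal.state = "fallback", target_state   # retune idle source
    active = nominal                        # switch: first fallback -> second fallback
    trace.append(active.name)
    fallback.role, fallback.state = "nominal", target_state  # retune idle source
    active = fallback                       # switch: second fallback -> second nominal
    trace.append(active.name)
    return active, trace


a = Source("A", "nominal", "low")
b = Source("B", "fallback", "low")
active, trace = transition(a, b, "high")
```

In this model a source is only ever retuned while the other one is carrying activity, and the two sources end the transition with their roles exchanged.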
The present techniques relate to mitigating droop conditions in systems having dynamic voltage and frequency scaling and there is disclosed a method of controlling a dynamic voltage and frequency scaling circuit, comprising: detecting a voltage droop relative to a current nominal voltage and frequency state; responsive to said current nominal voltage and frequency state having a corresponding fallback state in a safe operating zone of voltage and frequency, switching activity from a nominal source to a fallback source; and when a fallback to a safe operating zone is unavailable for said current nominal voltage and frequency state, pausing activity of the dynamic voltage and frequency scaling circuit.
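The decision made on detecting a droop reduces to a table lookup. In the sketch below the fallback table, the state names, and the `on_droop` function are hypothetical; a state missing from the table stands for a nominal state with no fallback in the safe operating zone.

```python
# Hypothetical table: nominal state -> fallback state in the safe operating zone.
FALLBACK = {"high": "high_safe", "mid": "mid_safe"}


def on_droop(current_state, fallback_table):
    """Return the action to take when a droop is detected in current_state."""
    fallback = fallback_table.get(current_state)
    if fallback is not None:
        return ("switch_to_fallback", fallback)
    return ("pause", None)  # no safe fallback exists: pause activity instead


action_high = on_droop("high", FALLBACK)
action_turbo = on_droop("turbo", FALLBACK)
```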
The present techniques relate to monitoring of operating parameters at a circuit and disclose a method comprising: receiving, at a delay monitor from a power delivery network, a voltage signal representative of a voltage level; receiving, at the delay monitor from a clock distribution network, a clock signal representative of an output clock of the clock distribution network; periodically generating, at the delay monitor, a measurement value responsive to the voltage signal and the clock signal; adjusting, at the delay monitor, a threshold level for the measurement value from a first threshold level to a second threshold level, where the second threshold level corresponds to a target voltage level; and providing, from the delay monitor to the clock distribution network, a non-violation signal responsive to the measurement value reaching the second threshold level.
The present techniques relate to clock control schemes and disclose circuitry for providing a clock signal to a sub-system of a processor, the circuitry comprising: a first clock selection stage to receive clock signals from a plurality of clock sources and, responsive to one or more first control signals, provide first and second clock signals to a second selection stage; and a second selection component at the second selection stage to, responsive to one or more second control signals, select one of the first and second clock signals and output the selected clock signal as a mitigated clock signal.
The present techniques relate to a clock control scheme and related methods and circuitry in a system comprising one or more processor cores, and there is disclosed a control state machine in a clock controller circuit, comprising: a sender operable to signal a request to a subordinate state machine and to store a request sent indicator in a store; a first receiver operable to receive an acknowledgement indicator signalled by the subordinate state machine and to clear the request sent indicator in the store; a delay component operable to hold the control state machine in a wait state; and a second receiver operable to receive a request complete indicator signalled by the subordinate state machine; the delay component being responsive to receipt of the request complete indicator to release the control state machine from the wait state.
The present techniques relate to monitoring of a clock signal at a circuit and disclose a method comprising: receiving, at a delay monitor, a gateable clock signal; analysing, by the delay monitor, the clock signal to generate a measurement value, wherein the measurement value is responsive to the clock signal and/or a voltage; comparing, by the delay monitor, the measurement value with a threshold; and storing the measurement value for further analysis when the measurement value does not meet the threshold, or discarding the measurement value when the measurement value meets the threshold.
Applications of improved detection of behavior from multi-variable data are provided. The applications are each described by a method that enables an evaluation of the degree of compatibility between more than one variable. The methods begin with inputs in a form of data. The inputs flow into an encoding step where encoders are configured to look for patterns in the inputs. A predicting step utilizes predictors to predict the next set of inputs. An energy function performs a comparing step that compares the predicted next set of inputs with data to determine if the predicted next set of inputs are compatible with one another or not. The comparison is used to detect that a certain behavior has occurred.
The present techniques relate to methods and circuits for implementing a voltage droop response and disclose a method of responding to a voltage droop in an electronic circuit, the method comprising: switching activity, in response to a voltage droop event, from a nominal clock source to a fallback clock source after a predetermined delay in time according to a programmable delay value.
Various implementations described herein are directed to a device having an array of bitcells with a first bitcell disposed adjacent to a second bitcell. The device may have a first wordline coupled to first transistors in the first bitcell, and the device may have a second wordline coupled to second transistors in the second bitcell. Also, the device may have a buried ground line coupled to the first transistors and the second transistors.
G11C 11/412 - Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger using field-effect transistors only
An apparatus has bridge circuitry to communicate transport packets between a transport network and data processing circuitry. Some or all of the data processing circuitry operates in a given reset domain other than the reset domain of the transport network. The apparatus also has packet tracking circuitry to monitor the transport packets received at the bridge circuitry and to track whether any required responses have been provided. The reset handling circuitry is responsive to a reset request for the given reset domain to: cause the bridge circuitry to reject new transport packets received for the domain; and when the packet tracking circuitry indicates that all required responses have been provided, accept the reset request for the given reset domain and allow the data processing circuitry to carry out a reset for the data processing circuitry operating in the given reset domain.
There is described a method of monitoring an electronic circuit voltage droop response; the method comprising: switching activity, in response to a voltage droop event, from a nominal clock source to a fallback clock source; and, optionally, switching activity, in response to a voltage recovery event, from a fallback clock source to a nominal clock source; wherein a voltage recovery event comprises a predetermined duration according to a configurable delay value without a voltage droop. The method further comprises at least one of: measuring a number of instances of switching from a nominal clock source to a fallback clock source; measuring an actual duration during which activity proceeds according to the fallback clock source without a voltage droop; measuring a fallback duration during which activity proceeds according to the fallback clock source; and measuring a number of instances of switching from a fallback clock source to a nominal clock source occurring during a voltage droop. Finally, the method may comprise modifying the switching to optimise activity efficiency based on the measuring. There is also described an electronic circuit configured to monitor voltage droop response of another electronic circuit according to the method.
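The measurements listed in this abstract can be sketched as a small Python replay of a droop trace. This is only an illustration under stated assumptions: the per-cycle boolean trace, the `DroopStats` and `monitor` names, and the choice to count the switching cycle itself as a fallback cycle are all hypothetical, and the sketch covers only a subset of the listed measurements.

```python
from dataclasses import dataclass


@dataclass
class DroopStats:
    to_fallback: int = 0               # switches nominal -> fallback
    to_nominal: int = 0                # switches fallback -> nominal
    fallback_cycles: int = 0           # cycles spent on the fallback clock
    fallback_cycles_no_droop: int = 0  # fallback cycles with no droop present


def monitor(droop_trace, recovery_delay):
    """Replay a per-cycle droop trace (True = droop) and collect counters.

    A recovery event is assumed to be `recovery_delay` consecutive
    droop-free cycles while running on the fallback clock.
    """
    stats = DroopStats()
    on_fallback = False
    quiet = 0  # consecutive droop-free cycles while on fallback
    for droop in droop_trace:
        if not on_fallback and droop:
            on_fallback = True
            quiet = 0
            stats.to_fallback += 1
        if on_fallback:
            stats.fallback_cycles += 1
            if droop:
                quiet = 0
            else:
                quiet += 1
                stats.fallback_cycles_no_droop += 1
                if quiet >= recovery_delay:
                    on_fallback = False
                    stats.to_nominal += 1
    return stats
```

Comparing `fallback_cycles_no_droop` with `fallback_cycles` for different values of `recovery_delay` gives a rough feel for how the configurable delay trades time on the slower clock against the risk of switching back too early.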
The present techniques relate to a method and circuitry for determining system characteristics of an electronic circuit and there is disclosed a delay monitor circuit to characterise an electronic circuit comprising: a delay line that quantifies the delay within a clock cycle; the delay line comprising a plurality of sampling points therealong; wherein, in a first mode, the delay monitor is configured to capture delay statistics over a given measurement period; and wherein, in a second mode, the delay monitor is configured to capture a measurement value from the plurality of sampling points, wherein the measurement value is indicative of one or more characteristics of the electronic circuit.
H03K 5/14 - Arrangements having a single output and transforming input signals into pulses delivered at desired time intervals by the use of delay lines
H03K 5/135 - Arrangements having a single output and transforming input signals into pulses delivered at desired time intervals by the use of time reference signals, e.g. clock signals
Temperature and Voltage Profiling Computer Systems and Methods
According to one implementation of the present disclosure, a method of profiling the temperature and voltage across different locations within a processor is disclosed. The method includes: in a first stage, determining respective first and second voltage sensitivity coefficients and respective first and second temperature sensitivity coefficients corresponding to a pair of ring oscillators; and in a second stage, determining a voltage deviation and a temperature deviation from a predetermined reference voltage and a predetermined reference temperature respectively, based on the determined respective first and second voltage sensitivity coefficients and the determined respective first and second temperature sensitivity coefficients.
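The second stage described here can be illustrated with a short Python sketch. It assumes, as a simplification not stated in the abstract, that each ring oscillator's frequency deviation is linear in the voltage and temperature deviations, so that two oscillators give a 2x2 linear system. The function name and parameter names are hypothetical, and the sensitivity coefficients are taken as already determined by the first stage.

```python
def solve_deviation(df1, df2, kv1, kt1, kv2, kt2):
    """Solve for (voltage deviation, temperature deviation).

    Assumed linear model for oscillators i = 1, 2:
        df_i = kv_i * dV + kt_i * dT
    where kv_i and kt_i are the voltage and temperature sensitivity
    coefficients and df_i is the measured frequency deviation.
    """
    det = kv1 * kt2 - kv2 * kt1
    if det == 0:
        # The two oscillators respond identically (up to scale), so
        # voltage and temperature effects cannot be separated.
        raise ValueError("sensitivity coefficients are not independent")
    d_voltage = (df1 * kt2 - df2 * kt1) / det
    d_temperature = (kv1 * df2 - kv2 * df1) / det
    return d_voltage, d_temperature
```

The determinant check reflects why a pair of oscillators with different sensitivities is needed: with identical sensitivities the two equations carry the same information.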
Various implementations described herein are directed to a method that acquires operating frequencies for a first set of ring oscillators disposed in a first integrated circuit, determines one or more first coefficients and a first constant for each ring oscillator in the first set, and determines a correlation between each of the first coefficients and the first constant. Also, the method may acquire a single operating frequency for each of a second set of ring oscillators in a second integrated circuit at a single pre-determined temperature so as to determine a second constant, predict one or more second coefficients for each ring oscillator in the second set based on the second constant and the correlation, and derive a temperature dependence based on the single operating frequency using the one or more second coefficients and the second constant for each of the second set of ring oscillators.
G01K 7/20 - Measuring temperature based on the use of electric or magnetic elements directly sensitive to heat using resistive elements the element being a linear resistance, e.g. platinum resistance thermometer in a specially-adapted circuit, e.g. bridge circuit
G01K 7/24 - Measuring temperature based on the use of electric or magnetic elements directly sensitive to heat using resistive elements the element being a non-linear resistance, e.g. thermistor in a specially-adapted circuit, e.g. bridge circuit
EVALUATING PERFORMANCE OF A DROOP MITIGATION SCHEME
The present techniques relate to a droop mitigation scheme and there is disclosed a method of evaluating the performance of a droop mitigation scheme, wherein the method is carried out at a circuit, the method comprising: receiving a clock output signal, wherein the droop mitigation scheme has been used to generate the clock output signal; and analysing the clock output signal to generate an output, wherein the output provides an indication of the performance of the droop mitigation scheme.
A computer implemented method for processing instructions in a multiprocessing apparatus comprises obtaining a first instruction of a first process; decoding the first instruction to detect a continuation indicator associated with the first instruction; determining whether or not to enforce the continuation indicator; and when it is determined to enforce the continuation indicator: continuing to execute the first process until completion of the first instruction and at least a next sequential second instruction of the first process. The continuation may temporarily suppress a normal eviction process based on a fairness algorithm, for example.
The present techniques relate to droop detection and there is disclosed circuitry for detecting a voltage droop event, the circuitry configured to: receive a clock signal from a clock distribution network; monitor a measurement value associated with a voltage provided to the circuitry by a power delivery network; provide a first droop detection signal to the clock distribution network in response to the measurement value reaching a first threshold value; cause a control entity to take a first mitigation action in response to the first droop detection signal; provide a second droop detection signal to the clock distribution network in response to the measurement value reaching a second threshold value different to the first threshold value; and cause the control entity to take a second mitigation action in response to the second droop detection signal.
A tile-based graphics processor performs first and second processing passes to generate a render output. The first processing pass generates and writes out information representative of a set of bounding boxes, and the second processing pass uses the bounding box information to determine which primitives to process for which rendering tiles.
A tile-based graphics processor performs first and second processing passes to generate a render output. The first processing pass generates data that is used in the second processing pass to determine which primitives to process for which rendering tiles. The first processing pass is performed by a geometry processing control unit assembling primitives, and one or more programmable processing units transforming geometry data defining the primitives, and processing the transformed geometry data to generate the data.
Example methods, apparatuses, and/or articles of manufacture are disclosed that may implement, in whole or in part, techniques to process pixel values sampled from a multi color channel imaging device. In particular, methods and/or techniques are disclosed to process pixel samples for interpolating pixel values for one or more color channels.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06T 7/90 - Determination of colour characteristics
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
An apparatus for improving the tracking of streams of memory accesses for training a stride prefetcher is provided, comprising a training data structure storing entries for training a stride prefetcher, a given entry specifying: a stride offset, a target address, a program counter address, and a bypass indicator indicating whether a program counter match condition is to be bypassed for the given entry; and training control circuitry to determine whether to update the stride offset for the given entry of the training data structure to specify a current stride between a target address of a current memory access and the target address for the last memory access of the tracked stream, in which the determination by the training control circuitry is controlled to be either dependent on a determination of whether the program counter match condition is satisfied or independent of whether the program counter match condition is satisfied, based on the bypass indicator.
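The role of the bypass indicator can be sketched in Python as follows. This is a minimal illustration, not the disclosed circuitry: the `TrainingEntry` fields mirror the entry contents named in the abstract, but the `train` function, the exact-equality program counter match, and the unconditional update once the check passes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TrainingEntry:
    pc: int                        # program counter address of the tracked stream
    last_addr: int                 # target address of the last tracked access
    stride: int = 0                # stride offset learned so far
    bypass_pc_match: bool = False  # bypass indicator


def train(entry, pc, addr):
    """Try to update the stride for `entry`; return True if it was updated.

    Without the bypass indicator, the update depends on the program
    counter match condition (assumed here to be simple equality).
    With the bypass indicator set, the update ignores the program counter.
    """
    if not entry.bypass_pc_match and pc != entry.pc:
        return False
    entry.stride = addr - entry.last_addr
    entry.last_addr = addr
    return True
```

With the indicator set, accesses from several instructions that walk the same data stream can train one entry, which is the situation the bypass appears intended to handle.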
G06F 12/0862 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
Apparatuses, methods, systems, chip containing products, and computer readable media are disclosed. An apparatus comprises dispatch circuitry to receive instructions, and to identify linear chains of instructions each comprising a first instruction and one or more further instructions, which are temporarily ineligible for execution due to a dependence on an immediately preceding instruction. The apparatus further comprises offline storage circuitry. The dispatch circuitry is configured, for each of the linear chains: to dispatch the sequentially first instruction to the issue circuitry and to retain the one or more further instructions in the offline storage circuitry until a chain trigger signal is received, the chain trigger signal indicating that a previously dispatched instruction, on which a sequentially next instruction depends, has satisfied a predefined issuing condition. In response to receipt of the chain trigger signal, the dispatch circuitry is configured to dispatch the sequentially next instruction to the issue circuitry.
Address translation circuitry 16 translates a virtual address specified by a memory access request issued by requester circuitry into a target physical address (PA). Requester-side filtering circuitry 20 performs a granule protection lookup based on the target PA and a selected physical address space (PAS) associated with the memory access request, to determine whether to allow the memory access request to be passed to a cache or interconnect. In the granule protection lookup, the requester-side filtering circuitry obtains granule protection information corresponding to a target granule of physical addresses including the target PA, which indicates at least one allowed PAS associated with the target granule, and blocks the memory access request when the granule protection information indicates that the selected PAS is not an allowed PAS.
G06F 12/14 - Protection against unauthorised use of memory
G06F 12/0808 - Multiuser, multiprocessor or multiprocessing cache systems with cache invalidating means
G06F 12/1045 - Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
There is provided an issuer apparatus, translation apparatus and access apparatus for providing a secure mechanism for accessing memory. When the issuer requests an address translation for a block of memory, the translation apparatus provides a translated address and a signature. The signature contains the address and a nonce and is signed using a private key that is not known to the issuer. When the issuer accesses the translated address, it provides the signature. The nonce is updated each time the permissions of the block of memory change. Consequently, the signature can be checked for validity when an access request is made. If the nonce is old, then the access request is prohibited. The issuer cannot simply modify the signature because it cannot access the private key. Consequently, memory is kept secure.
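The nonce-based check described in this abstract can be sketched in Python. This is an illustration under stated assumptions rather than the disclosed design: an HMAC over a secret key stands in for the private-key signature, the translation and access apparatuses are merged into one hypothetical `Translator` class, and the nonce is modelled as a per-block counter.

```python
import hashlib
import hmac


class Translator:
    """Hypothetical combined translation and access-check apparatus."""

    def __init__(self, key: bytes):
        self._key = key   # never exposed to the issuer
        self._map = {}    # virtual block -> translated address
        self._nonce = {}  # translated address -> current nonce

    def map_block(self, vblock: int, paddr: int) -> None:
        self._map[vblock] = paddr
        self._nonce[paddr] = 0

    def _sign(self, paddr: int, nonce: int) -> bytes:
        # The signature covers both the address and the nonce.
        msg = f"{paddr}:{nonce}".encode()
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def translate(self, vblock: int):
        paddr = self._map[vblock]
        return paddr, self._sign(paddr, self._nonce[paddr])

    def change_permissions(self, paddr: int) -> None:
        # Updating the nonce makes every earlier signature stale.
        self._nonce[paddr] += 1

    def check_access(self, paddr: int, signature: bytes) -> bool:
        expected = self._sign(paddr, self._nonce[paddr])
        return hmac.compare_digest(expected, signature)
```

An issuer that holds a signature from before a permission change fails `check_access` and has to request a fresh translation, and it cannot forge a new signature without the key.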
G06F 12/1072 - Decentralised address translation, e.g. in distributed shared memory systems
G06F 12/14 - Protection against unauthorised use of memory
H04L 9/32 - Arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system
Cache invalidation circuitry responds to a cache invalidation command specifying invalidation scope information indicative of at least one invalidation condition, to control a cache to perform an invalidation process to invalidate cache entries satisfying the invalidation condition(s). Cache lookup circuitry issues to the cache a cache lookup request specifying address information, to request that the cache returns a cache lookup response. Cache lookup response filtering circuitry is responsive to a given hit-indicating cache lookup response which provides cached information and invalidation qualifying information returned from a corresponding valid cache entry, to determine whether the given hit-indicating cache lookup response conflicts with an in-progress cache invalidation command, based on the invalidation scope information specified by the in-progress cache invalidation command and the invalidation qualifying information, and when conflict is detected, causes the given hit-indicating cache lookup response to be treated as a miss-indicating cache lookup response.
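The response filtering step can be sketched in Python. This is a hedged illustration only: the abstract does not say what the invalidation scope or the invalidation qualifying information contain, so the sketch uses a single hypothetical `asid` field for both, and all class and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Invalidation:
    # Hypothetical scope: invalidate entries with this identifier,
    # or every entry when the identifier is None.
    asid: Optional[int] = None


@dataclass
class LookupResponse:
    hit: bool
    data: Optional[int] = None
    asid: Optional[int] = None  # hypothetical invalidation qualifying information


def filter_response(resp: LookupResponse,
                    in_progress: List[Invalidation]) -> LookupResponse:
    """Turn a hit into a miss if it conflicts with an in-progress invalidation."""
    if not resp.hit:
        return resp
    for inv in in_progress:
        if inv.asid is None or inv.asid == resp.asid:
            # The entry is about to be invalidated, so its cached
            # information must not be used.
            return LookupResponse(hit=False)
    return resp
```

The point of the filter is that the cache itself need not finish the invalidation before lookups resume; stale hits are caught on the response path instead.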
A method and system for processing image data having a first bit depth using at least one trained neural network configured to operate on data having a second bit depth, where the second bit depth is smaller than the first bit depth. A plurality of image data portions is generated by splitting the image data. Each of the plurality of image data portions is encoded to produce a plurality of encoded image data portions having the second bit depth. The plurality of encoded image data portions are then processed by the at least one trained neural network, before being decoded and combined to produce composite image data. The composite image data is then output.
When performing texture processing operations in a graphics processing system, for a texture processing operation that requires M input texture data elements from an array of texture data elements, each of the M texture data elements is selected from a different set of texture data elements having a different set of positions within the texture data array. The texture processing operation is then performed using the M texture data elements.
When generating a sequence of render outputs using a graphics processor, the completion status of rendering tasks from different render outputs is tracked so that processing tasks for later render outputs in the sequence of outputs can be processed concurrently with processing tasks for earlier render outputs in the sequence of outputs whilst ensuring that any dependencies between the rendering tasks for the different render outputs are enforced. In particular, there is disclosed a mechanism for suspending the sequence of rendering jobs (so that it may subsequently be resumed).
When performing a sequence of rendering jobs, rendering tasks for separate rendering jobs are permitted to overlap within the graphics processor's processing (shader) cores. A record is maintained of which rendering tasks are currently being processed by the graphics processor's processing (shader) cores which record can then be used to enforce any data (processing) dependencies between different rendering jobs.
A method of managing write-after-read (WAR) hazards in a graphics processor. A host processor when preparing a graphics processor command stream can identify possible WAR hazards between rendering jobs for example by detecting layout transitions and insert a suitable barrier into the graphics processor command stream. The graphics processor when encountering such a barrier can then determine whether it is possible to ignore the barrier and allow rendering jobs to be processed concurrently.
A method of operating a graphics processor when performing a certain sequence of rendering jobs that produces a series of progressively lower resolution versions of the same render output comprising issuing rendering tasks for different rendering jobs concurrently and controlling processing for a later rendering job using a respective ‘task completion status’ data structure associated with the earlier rendering job on which it depends, wherein the looking up of respective entries in the ‘task completion status’ data structure takes into account the change in resolution between the first, earlier rendering job and the second, later rendering job.
When performing a texture sampling operation that uses the results of plural texture filtering operations to provide an overall output sampled texture value in a graphics processing system, it is determined whether a texture filtering operation in the set of plural texture filtering operations that are to be performed to provide the overall output sampled texture value can be at least partially merged with another texture filtering operation in the set of texture filtering operations. If so a merged texture filtering operation is performed for the two texture filtering operations, with the result of the merged texture filtering operation being used when providing the overall output sampled texture value.
When generating a sequence of render outputs using a graphics processor, the completion status of rendering tasks for different render outputs is tracked so that processing tasks for later render outputs in the sequence of outputs can be processed concurrently with processing tasks for earlier render outputs in the sequence of outputs whilst ensuring that any dependencies between the rendering tasks are enforced.
An apparatus is disclosed comprising decoder circuitry to decode instructions, wherein the decoder circuitry is responsive to a sequence of instructions to generate control signals, and processing circuitry responsive to the control signals to perform operations defined by the sequence of instructions. The decoder circuitry is responsive to a pointer control prefix instruction in the sequence of instructions to generate one or more control signals to cause the processing circuitry to incorporate a pointer control operation in association with a data processing operation defined by a given data processing instruction subsequent to the pointer control prefix instruction in the sequence, to control whether the data processing operation produces a result operand that is to be considered valid in a situation where a given input operand for the given data processing instruction is a pointer operand identifying a pointer used to determine an address to access memory.
A widening vector load instruction specifies at least one address operand and two or more vector destination registers each for specifying a vector operand having a given vector length. In response to decoding of the widening vector load instruction, at least one micro-operation is issued to control processing circuitry 16, 56 to: load at least one vector of data elements from a location in a memory system 30, 32, 34 corresponding to a target memory address determined based on the at least one address operand; widen the data elements of the loaded at least one vector from a first data element size to a second data element size larger than the first data element size; and write the widened data elements having the second data element size to the two or more vector destination registers as respective vector operands having the given vector length.
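The effect of such an instruction can be modelled in a few lines of Python. This is a behavioural sketch under assumptions that the abstract does not fix: zero-extension is used for widening, elements are little-endian, consecutive memory elements fill the first destination register before the second, and the function name and default sizes are hypothetical.

```python
def widening_load(memory: bytes, addr: int, vector_bytes: int = 8,
                  src_size: int = 1, dst_size: int = 2, num_dest: int = 2):
    """Model a widening load into `num_dest` registers of `vector_bytes` each.

    Elements of `src_size` bytes are read from `memory` starting at `addr`
    and zero-extended to `dst_size` bytes. Each returned register is a
    bytes object of length `vector_bytes`.
    """
    elems_per_reg = vector_bytes // dst_size
    regs = []
    for r in range(num_dest):
        reg = bytearray()
        for i in range(elems_per_reg):
            off = addr + (r * elems_per_reg + i) * src_size
            value = int.from_bytes(memory[off:off + src_size], "little")
            # Re-encoding at the larger size is the widening step.
            reg += value.to_bytes(dst_size, "little")
        regs.append(bytes(reg))
    return regs
```

With the defaults, eight 1-byte elements are read from memory and end up as four 2-byte elements in each of two 8-byte destination registers, so one vector's worth of narrow data fills two full-length registers.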
There is provided an apparatus, system, chip-containing product, method, and storage medium. The apparatus comprises memory access circuitry responsive to one or more types of memory access request, to retrieve specified data items from memory. The apparatus is also provided with local storage circuitry configured to store at least some of the retrieved data items. The local storage circuitry is N-way associative, and N is greater than 1. The apparatus is also provided with control circuitry responsive to an indication that an access request signalled to the local storage circuitry relating to an accessed data item corresponds to a predefined type of memory access request, to implement a restrictive access policy in relation to the accessed data item in the local storage circuitry. The restrictive access policy excludes at least one step of accessing an excluded subset of ways of the local storage circuitry.
When generating a sequence of render outputs using a graphics processor, the completion status of rendering tasks from different render outputs is tracked so that processing tasks for later render outputs in the sequence of outputs can be processed concurrently with processing tasks for earlier render outputs in the sequence of outputs whilst ensuring that any dependencies between the rendering tasks for the different render outputs are enforced.
A method of preparing a command stream for a parallel processor, comprising: analysing the command stream to detect at least a first dependency; generating at least one timeline dependency point responsive to detecting the first dependency; determining a latest action for the first dependency to derive a completion stream timeline point for the first dependency; comparing the completion stream timeline point for the first dependency with a completion stream timeline point for a second dependency to determine a latest stream timeline point; generating at least one command stream synchronization control instruction according to the latest stream timeline point; and providing the command stream and the at least one command stream synchronization control instruction to an execution unit of the parallel processor.
Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to enhance a rendered image. In an implementation, a process to enhance a portion of a rendered image may be affected based, at least in part, on a shading rate applied in rendering the portion of the rendered image.
When preparing and storing primitive lists in a tile-based graphics processing system, one or more primitive list pointer arrays store pointers, each pointer indicating a location in storage of one or more of the primitive lists. A further pointer array stores further pointers, each further pointer indicating a location in storage of one or more of the primitive list pointer arrays.
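The two-level pointer arrangement can be sketched in Python. This is an illustration only: addresses are simulated with dictionary keys, each pointer is assumed to refer to exactly one primitive list or pointer array, and the class name, fixed array capacity, and address increments are hypothetical.

```python
class PrimitiveListStore:
    """Primitive lists reached via pointer arrays, which are themselves
    reached via a further pointer array."""

    def __init__(self, pointers_per_array=4):
        self.pointers_per_array = pointers_per_array
        self.storage = {}           # address -> primitive list
        self.pointer_arrays = {}    # address -> list of primitive-list addresses
        self.further_pointers = []  # addresses of the pointer arrays
        self._next_addr = 0x1000    # simulated allocator state

    def _alloc(self, obj, table):
        addr = self._next_addr
        self._next_addr += 0x10
        table[addr] = obj
        return addr

    def add_list(self, primitives):
        """Store a primitive list and return its overall index."""
        if (not self.further_pointers or
                len(self.pointer_arrays[self.further_pointers[-1]])
                == self.pointers_per_array):
            # Current pointer array is full (or none exists): start a new one.
            self.further_pointers.append(self._alloc([], self.pointer_arrays))
        array = self.pointer_arrays[self.further_pointers[-1]]
        array.append(self._alloc(list(primitives), self.storage))
        return ((len(self.further_pointers) - 1) * self.pointers_per_array
                + len(array) - 1)

    def lookup(self, index):
        """Follow the further pointer, then the pointer, to the list."""
        outer, inner = divmod(index, self.pointers_per_array)
        array = self.pointer_arrays[self.further_pointers[outer]]
        return self.storage[array[inner]]
```

Because new pointer arrays are allocated on demand, the number of primitive lists does not need to be known up front; only the further pointer array grows as more lists are stored.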
An apparatus is provided for varying paths from power sources to components in order to inhibit side channel attacks. A power source provides power. A circuit component consumes the power to perform a function, and a power grid provides a plurality of redundant paths by which the power can flow between the circuit component and one of the power source and ground, to perform the function. The power grid dynamically selects at least one active path of the redundant paths through which the power flows to perform the function.
H02J 3/00 - Circuit arrangements for ac mains or ac distribution networks
G06F 21/72 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in cryptographic circuits
An apparatus comprises instruction decoding circuitry (10) to decode instructions and issue circuitry (12) to issue at least one micro-operation to control processing circuitry (16, 56) to perform a processing operation. In response to decoding of a narrowing vector store instruction specifying a plurality of vector source registers each for specifying a vector operand, the instruction decoding circuitry is configured to control the issue circuitry to issue at least one micro-operation to control the processing circuitry to narrow data elements of the plurality of vector source registers to a first data element size from a second data element size, the first data element size being smaller than the second data element size, and store, to a location in a memory system, at least one vector of narrowed data elements comprising data elements of the plurality of vector source registers narrowed to the first data element size.
A processing element is configured to approximate a transcendental function. The processing element comprises an input storage and a look-up storage. The processing element obtains, from the input storage, floating-point input data having an input exponent value and an input mantissa value. The processing element looks up approximation parameters and an output exponent value from the look-up storage, wherein each group of approximation parameters and output exponent value are stored in the look-up storage in association with a respective range of a plurality of ranges that are defined by the input exponent value and the input mantissa value. The ranges cover values of the input exponent value and input mantissa value such that the output exponent value associated with each range does not change by more than a predetermined number. An approximation function is evaluated that approximates the transcendental function based on the looked-up approximation parameters and output exponent.
A first image frame in a first resolution format comprising image signal intensity values mapped to first pixel locations has sampling points offset from the centers of first pixel locations according to associated jitter vectors. The image signal intensity values are mapped to second pixel locations in a second image frame in a second resolution format based at least in part on the jitter vectors, the second resolution format being higher resolution than the first resolution format. The mapped image signal intensity values are combined with image signal intensity values of an accumulated history image buffer according to coefficients predicted by a neural network such as based on magnitudes of the jitter vectors. Interpolated pixel image intensities are added to the accumulated history buffer for empty or null pixel locations in the second image frame in the second image format. Upsampling artifacts such as checkerboarding and aliasing are reduced.
A memory instance comprises a plurality of banks of storage cells to store data values, and input/output circuitry shared between the plurality of banks for receiving write data or outputting read data. Each bank of storage cells supports a power saving mode and an operational mode. A control interface receives power control signals for controlling use of the power saving mode. Bank power control circuitry individually controls, for each of a plurality of subsets of banks of storage cells within the same memory instance, whether that subset of banks is in the power saving mode based on the power control signals. For at least one setting for the power control signals, one subset of banks is in the power saving mode while another subset of banks in the same memory instance is in the operational mode. Also disclosed is power control circuitry which selects the power mode to use for each subset of banks and generates the power control signals.
According to the present techniques there is provided a method of operating a data processor unit to generate processing tasks. The data processor unit comprises a control circuit configured to receive, from a host processor unit, a request for the data processor unit to perform processing jobs and to generate a workload for each job. Each workload comprises one or more tasks. The data processor unit further comprises first and second execution units to process the workloads. The method comprises: receiving, at the control circuit, a request to perform first and second processing jobs; generating, at the control circuit in response to the request, a primary workload for the first processing job, and a secondary workload for the second processing job; generating, at the control circuit, one or more operation instructions to control processing of the primary and/or secondary workloads at the first and/or second execution units; processing, at the first execution unit, the primary workload in accordance with the operation instructions; and processing, at the second execution unit, the secondary workload in parallel with the primary workload in accordance with the operation instructions.
Various implementations described herein are directed to a device having a power-gate structure (104) with multiple transistors (T1, T2) including a first transistor (T1) and a second transistor (T2). The first transistor (T1) may be coupled between a first voltage node (n1) and a second voltage node (n2), and the second transistor (T2) may be coupled between the second voltage node (n2) and a third voltage node (n3) that is coupled to the second voltage node (n2).
H03K 19/00 - Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
H01L 27/02 - Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including integrated passive circuit elements with at least one potential-jump barrier or surface barrier
H01L 27/092 - Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including integrated passive circuit elements with at least one potential-jump barrier or surface barrier the substrate being a semiconductor body including only semiconductor components of a single kind including field-effect components only the components being field-effect transistors with insulated gate complementary MIS field-effect transistors
Various implementations described herein are directed to a device having a bank of bitcells split into a plurality of portions including a first row slice of the bitcells and a second row slice of the bitcells. Also, the device may have control circuitry configured to access and repair a first bitcell in the first row slice with a first row address and a second bitcell in the second row slice with a second row address that is different than the first row address.
System emulation of a floating-point dot product operation can be performed without directly performing the arithmetic by decomposing an addend into a constituent sign, an exponent, and a fractional part; performing inverse scaling of the addend by subtracting a scaling exponent (LSCALE) of a scaling of a negative power of two from the exponent to calculate an inverse-scaled addend; comparing a corresponding fractional part of the inverse-scaled addend with notional exponents of the most significant bit (MSB) and the least significant bit (LSB) of a fixed-point accumulator to determine which of three cases has been encountered; and adding particular values representing the addend to the calculation result according to which of the three cases has been encountered. The three cases are: the inverse-scaled addend can be exactly accumulated into the fixed-point accumulator; the inverse-scaled addend is too large to be exactly accumulated; and the inverse-scaled addend is too small to be exactly accumulated.
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
G06F 5/01 - Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
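The three-way case split in the dot product emulation abstract above can be sketched as follows. This is one possible reading under stated assumptions, not the disclosed method: the addend is a Python float, "too large" is taken to mean its top bit lies above the accumulator MSB, "too small" is taken to mean it has set bits below the accumulator LSB, and the names `classify_addend`, `acc_msb_exp`, and `acc_lsb_exp` are hypothetical. The sketch only classifies the addend; it does not perform the accumulation step.

```python
import math

MANTISSA_BITS = 53  # significand width of a Python float (IEEE 754 double)


def classify_addend(addend, lscale, acc_msb_exp, acc_lsb_exp):
    """Classify an addend against a fixed-point accumulator window.

    acc_msb_exp and acc_lsb_exp are the assumed notional exponents of
    the accumulator's most and least significant bits.
    """
    if addend == 0.0:
        return "exact"  # zero always accumulates exactly

    # Decompose: addend = mantissa * 2**exponent with 0.5 <= |mantissa| < 1.
    mantissa, exponent = math.frexp(addend)
    exponent -= lscale  # inverse scaling by subtracting the scaling exponent

    top_bit_exp = exponent - 1  # weight of the highest set bit

    # Weight of the lowest set bit, found from the integer significand.
    significand = int(math.ldexp(abs(mantissa), MANTISSA_BITS))
    trailing_zeros = (significand & -significand).bit_length() - 1
    low_bit_exp = exponent - MANTISSA_BITS + trailing_zeros

    if top_bit_exp > acc_msb_exp:
        return "too_large"
    if low_bit_exp < acc_lsb_exp:
        return "too_small"
    return "exact"


print(classify_addend(1.0, lscale=0, acc_msb_exp=10, acc_lsb_exp=-10))
```

Working on the decomposed exponent and significand, rather than on the value itself, mirrors the abstract's point that the case can be decided without directly performing the arithmetic.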
Various implementations described herein are directed to a device having a power-gate structure with multiple transistors including a first transistor and a second transistor. The first transistor may be coupled between a first voltage node and a second voltage node, and the second transistor may be coupled between the second voltage node and a third voltage node that is coupled to the second voltage node.
A data processing apparatus includes pointer storage configured to store pointer values for pointers. Increment circuitry, responsive to one or more increment events, increments each of the pointer values in dependence on a corresponding live pointer value update condition from corresponding live pointer value update conditions. The corresponding live pointer value update condition is different for each of the pointers. History storage circuitry stores resolved behaviours of instances of a control flow instruction, each of the resolved behaviours being associated with one of the pointers. At least one of the live pointer value update conditions is changeable at runtime. Consequently, storage can be reduced as compared to a situation where all pointer value update conditions are active.
An apparatus comprises a plurality of interfaces, each couplable to a respective one of a plurality of processing circuitries either in a higher criticality compliance state or a lower criticality compliance state. Each interface can receive from its respective processing circuitry interrupt signals destined to a target processing circuitry of the plurality of processing circuitries and transmit to its respective processing circuitry interrupt signals issued by a source processing circuitry of the plurality of processing circuitries. Control circuitry monitors the flow of the interrupt signals and determines whether the flow of interrupt signals exhibits a discrepancy with respect to an expected flow of interrupt signals, and performs a mitigation action in respect of said discrepancy to avoid violation of the higher criticality compliance state.
Various implementations described herein are directed to a device with magnetic plates embedded within a substrate having a first magnetic flux. The device may include one or more conductive cylinders that provide an electric path that passes through the magnetic plates. Also, the magnetic plates may enable a second magnetic flux within the magnetic plates such that the second magnetic flux is greater than the first magnetic flux.
An apparatus is provided for improving the use of multiple-issue operations in a data processor. A variable-issue operation can be recognised as being either a single-issue operation or a multiple-issue operation in dependence on the state of the program at runtime. If a variable-issue operation can be scheduled as a multiple-issue operation, then other operations can be scheduled for performance in the same cycle, when they would have otherwise had to be scheduled for a later cycle. As such, more operations can be performed in fewer cycles, thus improving code density and improving data processing performance.
Example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to adapt a neural network structure to a target platform. One or more performance metrics may be observed for an execution of the neural network structure implemented by one or more target hardware elements. A module from a library of modules may be selected to replace one or more elements of the neural network structure based, at least in part, on the observed one or more performance metrics.
Disclosed are devices and/or processes to process image frames expressed in part by different lighting components, such as lighting components generated using ray tracing. In an embodiment, different lighting components of a previous image frame may be separately warped and combined with like lighting components in a current image frame.
A data processing apparatus is provided. It includes history storage circuitry that stores historic data of instructions and prediction circuitry that predicts a historic datum of a specific instruction based on subsets of the historic data of the instructions. The history storage circuitry overwrites the historic data of one of the instructions to form a corrupted historic datum, and at least one of the subsets of the historic data of the instructions includes the corrupted historic datum.
Barcelona Supercomputing Center - Centro Nacional de Supercomputación (Spain)
Inventor
Siracusa, Marco
Randall, Joshua
Joseph, Douglas James
Moretó Planas, Miquel
Armejach Sanosa, Adrià
Abstract
A data structure marshalling unit for a processor comprises data structure traversal circuitry to perform data structure traversal processing according to a dataflow architecture. The data structure traversal circuitry comprises two or more layers of traversal circuit units, each layer comprising two or more parallel lanes of traversal circuit units. Each traversal circuit unit triggers loading, according to a programmable iteration range, of at least one stream of elements of at least one data structure from data storage circuitry. For at least one programmable setting for the data structure traversal circuitry, the programmable iteration range for a given traversal circuit unit in a downstream layer is dependent on one or more elements of the at least one stream of elements loaded by at least one traversal circuit unit in an upstream layer. Output interface circuitry outputs to the data storage circuitry at least one vector of elements loaded by respective traversal circuit units in a given active layer of the data structure traversal circuitry.
Various implementations described herein are directed to a device having first transistors arranged as cross-coupled inverters coupled between a disconnect node and ground. The device may have second transistors arranged as passgates coupled between the cross-coupled inverters and bitlines. The device may have third transistors coupled between a voltage supply and the disconnect node.
G11C 11/412 - Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger using field-effect transistors only
A method of core allocation for temperature control in under-subscribed computing systems can include maintaining, by an allocator component, a virtual to physical core mapping of physical cores of a computing system; identifying a change to a next core allocation according to an allocation pattern; determining a time that a process of a software application can be moved to the next core allocation, wherein the process of the software application is configured with core pinning; and updating, by the allocator component, the virtual to physical core mapping for the process of the software application at the determined time. The allocator component can intercept a process request of the software application; identify a pinned core of the process request; and overwrite the pinned core of the process request with an allocated core indicated by the virtual to physical core mapping.
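A minimal sketch of the mapping and interception steps described above, under stated assumptions: the allocation pattern is a simple rotation to the next physical core, a process request is modelled as a dictionary with a `core` field, and the names `CoreAllocator`, `advance`, and `intercept` are hypothetical. Determining the time at which a process may safely be moved is left out.

```python
class CoreAllocator:
    """Hypothetical allocator that keeps a virtual-to-physical core mapping."""

    def __init__(self, physical_cores):
        self.physical_cores = list(physical_cores)
        # Start with an identity mapping: virtual core i -> i-th physical core.
        self.mapping = dict(enumerate(self.physical_cores))

    def advance(self):
        """Move every virtual core to the next physical core (assumed pattern)."""
        n = len(self.physical_cores)
        self.mapping = {
            virtual: self.physical_cores[(self.physical_cores.index(physical) + 1) % n]
            for virtual, physical in self.mapping.items()
        }

    def intercept(self, request):
        """Overwrite the pinned core of a request with the allocated core."""
        rewritten = dict(request)  # leave the caller's request untouched
        rewritten["core"] = self.mapping[request["core"]]
        return rewritten


allocator = CoreAllocator([0, 1, 2, 3])
allocator.advance()
print(allocator.intercept({"pid": 7, "core": 3}))
```

Because the application only ever names a virtual core, it can keep its core pinning unchanged while the allocator moves the work across physical cores.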
Methods and systems for reducing the appearance of block-related artifacts are described. The methods include obtaining image frames in a sequence of image frames and adjusting some or all image frames in the sequence of image frames to generate adjusted image frames. The adjusted image frames may be created by generating image data and adding it to one or more edges of the image frame or by discarding image data from the image frame. The adjusted image frames shift a block origin relative to image data in the image frame. The adjusting is performed so that the shift of the block origin varies during the sequence of image frames. A block-based process is applied to each adjusted image frame to generate processed image frames, wherein blocks of image data are selected and processed in each adjusted frame of image data according to the block origin.
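To illustrate how adding image data to an edge shifts the block origin, the sketch below pads each row on the left and reports where the block boundaries then fall in the coordinates of the original image. Replicating the edge pixel as the generated image data, and the names `pad_left` and `block_boundaries`, are assumptions for this example; the abstract does not say how the added data is produced.

```python
def pad_left(frame, shift):
    """Add `shift` columns to the left edge by replicating the first pixel."""
    return [[row[0]] * shift + list(row) for row in frame]


def block_boundaries(frame_width, block_size, shift):
    """Interior block boundaries of the padded frame, in original columns.

    Blocks are laid out from column 0 of the padded frame, so padding by
    `shift` moves every boundary `shift` columns to the left relative to
    the original image data.
    """
    padded_width = frame_width + shift
    return [
        edge - shift
        for edge in range(block_size, padded_width, block_size)
        if edge - shift > 0
    ]


# Vary the shift across a sequence of frames so boundaries do not line up.
for shift in (0, 1, 2, 3):
    print(shift, block_boundaries(frame_width=8, block_size=4, shift=shift))
```

Since the boundaries land on different image columns from frame to frame, any seam left by the block-based process does not stay fixed at one position over the sequence.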
A data processing apparatus is provided. It includes first history storage circuitry that stores control flow information of control flow instructions. Second history storage circuitry stores a subset of the control flow information by considering a subset of the control flow instructions. Prediction circuitry produces a prediction for a specific one of the control flow instructions based on the subset of the control flow information and power control circuitry performs a determination of an extent to which the subset of the control flow information matches the control flow information and disables the prediction circuitry in dependence on a result of the determination.
Multiple successively spatially downsampled versions of a rendered image frame are generated for at least one two-dimensional signal component of one or more two-dimensional signal components of an image frame, and one or more versions of the rendered image frame are selected from among the rendered image frame and the spatially downsampled versions of the rendered image frame for sampling a texture feature based, at least in part, on a prediction computed by a neural network.
Disclosed are processes and devices for processing an image frame. An image frame rendered on rendering instances may comprise pixel values for one or more lighting components. One or more first buffers may store accumulations of pixel values for the one or more lighting components over at least some past rendering instances. One or more second buffers may store pixel values for the one or more lighting components of a processed image frame. Pixel values in the one or more first buffers may be combined with pixel values in the one or more second buffers to provide pixel values for the one or more lighting components in an accumulated image frame.
According to one implementation of the present disclosure, a power grid comprises: one or more cells; a metal layer; first and second buried power rails; and one or more local interconnects, wherein one or more local interconnect stitches are configured to electrically couple the one or more cells to either of the first or second buried power rails through the metal layer and the one or more local interconnects.
H01L 23/522 - Arrangements for conducting electric current within the device in operation from one component to another including external interconnections consisting of a multilayer structure of conductive and insulating layers inseparably formed on the semiconductor body
H01L 27/02 - Devices consisting of a plurality of semiconductor or other solid-state components formed in or on a common substrate including integrated passive circuit elements with at least one potential-jump barrier or surface barrier
When performing tile-based graphics processing, a first vertex shading operation to generate vertex shaded position data for vertices is performed, and the vertex shaded position data is used to prepare primitive lists indicating which primitives should be rendered for respective rendering tiles. Then, when processing a tile, a second vertex shading operation is performed for vertices of primitives for the tile for which fragments have been generated by a rasteriser prior to rendering the graphics fragments, to generate vertex shaded non-position attribute data for the vertices, based on the results of early depth testing before the fragments are rendered.
An apparatus has pointer storage to store pointer values for a plurality of pointers and increment circuitry, responsive to a series of increment events, to differentially increment the pointer values of the pointers. Training circuitry comprises tracker circuitry to maintain a plurality of tracker entries and cache circuitry to maintain a plurality of cache entries. Each tracker entry identifies a control flow instruction, and each cache entry stores a resolved behaviour of an instance of a control flow instruction identified by a tracker entry. For a given control flow instruction identified in a given tracker entry, the training circuitry performs a training process to seek to determine, as an associated pointer for the given control flow instruction, a pointer from amongst the plurality of pointers whose pointer value increments in a manner that meets a correlation threshold with occurrence of instances of the given control flow instruction. Promotion circuitry, responsive to detection of the correlation threshold being met for the given control flow instruction, allocates a prediction entry within prediction circuitry to identify the given control flow instruction and the associated pointer, and a behaviour record is established within the prediction entry identifying the resolved behaviour for one or more instances of the given control flow instruction. The behaviour record is arranged such that each resolved behaviour is associated with the pointer value of the associated pointer at the time that resolved behaviour was observed. Responsive to a prediction trigger associated with a replay of a given instance of the given control flow instruction, the prediction circuitry determines, in dependence on a current pointer value of the associated pointer, a predicted behaviour of the given instance of the given control flow instruction from the behaviour record within the prediction entry.
Prediction circuitry generates a prediction associated with a prediction input address, for controlling a speculative action by a processor. The prediction circuitry comprises combiner circuitry to determine a combined prediction by applying a prediction combination function to a given address and sets of prediction information generated by a plurality of predictors corresponding to the given address. A combiner cache structure comprises combiner cache entries. A given combiner cache entry associated with an address indication indicates items of combined prediction information determined by the combiner circuitry for an address corresponding to the address indication and different combinations of possible values for the respective sets of prediction information. Combiner cache lookup circuitry looks up the combiner cache structure based on the prediction input address to identify a selected combiner cache entry, and generates the prediction based on a selected item of combined prediction information selected from the selected combiner cache entry based on the respective sets of prediction information generated by the predictors corresponding to the prediction input address.
G06F 12/0862 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
92.
MEMORY SYNCHRONISATION SUBSEQUENT TO A MAINTENANCE OPERATION
There is provided an apparatus, system, method, and medium. The apparatus comprises one or more processing elements, each processing element comprising processing circuitry to perform processing operations in one of a plurality of processing contexts. Each processing element further comprises context tracking circuitry to store context tracking data indicative of active contexts. Each processing element comprises control circuitry responsive to a request for a memory synchronisation occurring subsequent to at least one maintenance operation associated with a given set of one or more contexts of the plurality of processing contexts, to determine whether at least one of the given set of one or more contexts is indicated in the context tracking data. The control circuitry is configured, when at least one of the given set of one or more contexts is determined to be indicated in the context tracking data, to implement a delay before performing the memory synchronisation, and when each of the given set of one or more contexts is determined to be absent from the context tracking data, to perform the memory synchronisation without implementing the delay.
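The decision made by the control circuitry above reduces to a set-intersection check, sketched below as a software model. Representing the context tracking data as a Python set, and the names `ProcessingElement` and `must_delay_sync`, are assumptions for illustration only.

```python
class ProcessingElement:
    """Toy model of one processing element's context tracking."""

    def __init__(self):
        # Context tracking data: identifiers of currently active contexts.
        self.active_contexts = set()

    def enter_context(self, context_id):
        self.active_contexts.add(context_id)

    def leave_context(self, context_id):
        self.active_contexts.discard(context_id)

    def must_delay_sync(self, maintained_contexts):
        """True if any context targeted by the maintenance is still tracked."""
        return not self.active_contexts.isdisjoint(maintained_contexts)


pe = ProcessingElement()
pe.enter_context("ctx_a")
print(pe.must_delay_sync({"ctx_a", "ctx_b"}))  # tracked context: delay
print(pe.must_delay_sync({"ctx_c"}))           # none tracked: no delay
```

The delay is paid only by processing elements that may still be working in an affected context; every other element can complete the synchronisation straight away.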
An apparatus stores pointer values for pointers which are incremented differentially and has prediction circuitry to maintain prediction entries each identifying a control flow instruction, an associated pointer, and a behaviour record indicating resolved behaviour of the control flow instruction. Resolved behaviour stored in a selected element of the behaviour record identified using a pointer value of the associated pointer may be used as predicted behaviour for a control flow instruction. The prediction entries include a first type of prediction entry and a further type of prediction entry, where prediction circuitry uses each prediction entry of the first type to identify a control flow instruction whose associated pointer is within a first subset of the pointers, and uses each prediction entry of a further type to identify a control flow instruction whose associated pointer is within a further subset of the pointers excluding at least one pointer of the first subset.
Combiner circuitry generates a combined prediction associated with a given address based on combining respective sets of prediction information generated by two or more predictors. Predictor control circuitry determines, based on a lookup of a prediction input address in a combiner hint data structure, whether a second predictor lookup suppression condition is satisfied for the prediction input address indicating that the combined prediction that would be determined by the combiner circuitry for the prediction input address is likely to be derivable from a prediction outcome predicted by the first predictor for the prediction input address. If this condition is satisfied, a lookup of the second predictor is suppressed and the prediction associated with the prediction input address is generated based on the prediction outcome predicted by the first predictor for the prediction input address.
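The suppression flow above can be sketched with the predictors modelled as plain functions. Treating the combiner hint data structure as a set of addresses, and the names `predict`, `hints`, and `combine`, are assumptions for this example; how hint entries are allocated or trained is not covered.

```python
def predict(address, first_predictor, second_predictor, combine, hints):
    """Return a prediction, skipping the second predictor when hinted.

    `hints` stands in for the combiner hint data structure: an address
    present in it is assumed to satisfy the suppression condition.
    """
    first = first_predictor(address)
    if address in hints:
        # Combined result is expected to follow the first predictor, so
        # the second predictor lookup is suppressed.
        return first
    second = second_predictor(address)
    return combine(address, first, second)


second_lookups = []


def second_predictor(address):
    second_lookups.append(address)  # record each lookup actually performed
    return False


result_hinted = predict(0x40, lambda a: True, second_predictor,
                        lambda a, p1, p2: p1 and p2, hints={0x40})
result_full = predict(0x80, lambda a: True, second_predictor,
                      lambda a, p1, p2: p1 and p2, hints={0x40})
print(result_hinted, result_full, second_lookups)
```

Only the unhinted address reaches the second predictor, so the saving comes from the lookups that the hint allows to be skipped.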
Data processing systems comprising a data processor, the data processor comprising an execution unit and storage for storing input data values for use by and/or output data values generated by the execution unit when executing instructions to perform data processing operations, and methods of control thereof, in which control of storage of data values for data source(s) of the storage is based on indication(s), in instruction(s) requiring use of data source(s) for a data processing operation, that one or more data values in the data source(s) are no longer required to be retained.
Circuitry including cache storage and control circuitry is provided. The cache storage includes an array of random access memory storage elements, and is configured to store data in multiple cache sectors, each cache sector including a number of cache storage data units. The control circuitry is configured to control access to the cache storage including, for example, accessing the cache storage data units in the cache sectors. After accessing a cache storage data unit in a cache sector, the energy requirement and/or latency for the next access to a cache storage data unit in the same sector is lower than the energy requirement and/or latency for the next access to a cache storage data unit in a different sector.
A method is presented that includes detecting, by a loader, overlapping permissions for a page while loading a binary file. When writable data overlaps with read-only code in a page, the loader copies the code part of the page with the overlapping permissions to a new page. The original page is set non-executable. The new page can be set executable but read-only. When execution reaches the now non-executable original page, a segmentation fault may be raised. A signal handler installed by the loader detects that the fault is coming from the original page and redirects execution to the new page with the copied code part.
A data processing apparatus includes interception circuitry for intercepting an incoming signal corresponding to an instruction from a processor element to a PCI device. Respond circuitry provides a response to the incoming signal back to the processor element and the response is either an acceptance of the incoming signal or a refusal of the incoming signal based on a flow control between the data processing apparatus and the PCI device. Forward circuitry performs a transmission, to the PCI device, of an outgoing signal corresponding to the instruction after the response has indicated acceptance of the incoming signal.
A message channel functionality for a data processing system is disclosed. This provides communication channels which may be considered to be a shared resource. The approach combines atomic stores, which are fully completed in a single atomic transaction, and non-coherence to provide non-coherent atomic stores that are conditional to implement primitive communications channels that can be used to implement software queues and channels more efficiently. This enables the programmer to execute a store from registers on one side of a communications link and to have that data appear in the registers of a data consumer on that link directly, bypassing both the shared state upgrade problem and the parallel problem of acquiring a synchronization lock before data send.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors