A view of geometry captured in image data generated by an imaging sensor is compared with a description of the geometry in a volumetric data structure. The volumetric data structure describes the volume at a plurality of levels of detail and includes entries describing voxels defining subvolumes of the volume at multiple levels of detail. The volumetric data structure includes a first entry to describe voxels at a lowest one of the levels of detail and further includes a number of second entries to describe voxels at a higher, second level of detail, the voxels at the second level of detail representing subvolumes of the voxels at the first level of detail. Each of these entries includes bits to indicate whether a corresponding one of the voxels is at least partially occupied with the geometry. One or more of these entries are used in the comparison with the image data.
A pruned version of a neural network is generated by determining pruned versions of each of a plurality of layers of the network. The pruned version of each layer is determined by sorting a set of channels of the layer based on respective weight values of each channel in the set. A percentage of the set of channels is pruned based on the sorting to form a thinned version of the layer. Accuracy of a thinned version of the neural network is tested, where the thinned version of the neural network includes the thinned version of the layer. The thinned version of the layer is used to generate the pruned version of the layer based on the accuracy of the thinned version of the neural network exceeding a threshold accuracy value. A pruned version of the neural network is generated to include the pruned versions of the plurality of layers.
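The per-layer procedure in this abstract can be illustrated with a minimal sketch. The ranking score (sum of absolute weights per channel), the schedule of pruning fractions, and the names `thin_layer`, `prune_layer`, and `evaluate` are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Callable, List, Sequence

Channel = List[float]  # flattened weights of one channel (illustrative representation)


def thin_layer(channels: List[Channel], prune_fraction: float) -> List[Channel]:
    """Drop the given fraction of channels with the smallest summed |weight|."""
    # Assumption: channels are ranked by the L1 norm of their weights.
    order = sorted(range(len(channels)),
                   key=lambda i: sum(abs(w) for w in channels[i]))
    n_prune = int(len(channels) * prune_fraction)
    keep = sorted(order[n_prune:])  # preserve the original channel order
    return [channels[i] for i in keep]


def prune_layer(channels: List[Channel],
                evaluate: Callable[[List[Channel]], float],
                threshold: float,
                fractions: Sequence[float] = (0.75, 0.5, 0.25)) -> List[Channel]:
    """Return the most aggressively thinned layer whose accuracy exceeds threshold.

    `evaluate` is a hypothetical callback that tests the network with the
    thinned layer in place and returns its accuracy.
    """
    for fraction in fractions:
        thinned = thin_layer(channels, fraction)
        if evaluate(thinned) > threshold:
            return thinned
    return channels  # no thinned version met the threshold; keep the layer as is
```

Repeating `prune_layer` over each layer and collecting the results would give a pruned network in the sense described above.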
Example systems disclosed herein include a database to store records of operator-labeled video segments (e.g., as records of operator-labeled video segments). The operator-labeled video segments include reference video segments and corresponding reference event labels describing the video segments. Disclosed example systems also include a neural network including a first instance of an inference engine, and a training engine to train the first instance of the inference engine based on a training set of the operator-labeled video segments obtained from the database, the first instance of the inference engine to infer events from the operator-labeled video segments included in the training set. Disclosed example systems further include a second instance of the inference engine to infer events from monitored video feeds, the second instance of the inference engine being based on the first instance of the inference engine.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
An output of a first one of a plurality of layers within a neural network is identified. A bitmap is determined from the output, the bitmap including a binary matrix. A particular subset of operations for a second one of the plurality of layers is determined to be skipped based on the bitmap. Operations are performed for the second layer other than the particular subset of operations, while the particular subset of operations is skipped.
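A minimal sketch of bitmap-guided skipping follows. It assumes the bitmap marks the non-zero elements of the first layer's output and that the second layer is a dense layer over that output; both choices, and the function names, are illustrative assumptions rather than details from the abstract.

```python
def make_bitmap(output):
    """Binary matrix with 1 wherever the first layer's output is non-zero."""
    return [[1 if value != 0 else 0 for value in row] for row in output]


def dense_with_skips(output, bitmap, weights):
    """Dense second layer that skips multiply-accumulates flagged 0 in the bitmap.

    weights[k][i][j] is the weight from input element (i, j) to neuron k.
    Returns (neuron outputs, number of multiply-accumulates actually performed).
    """
    results = []
    performed = 0
    for neuron_weights in weights:
        acc = 0.0
        for i, row in enumerate(output):
            for j, value in enumerate(row):
                if bitmap[i][j] == 0:
                    continue  # skipped: a zero input contributes nothing
                acc += neuron_weights[i][j] * value
                performed += 1
        results.append(acc)
    return results, performed
```

With a sparse first-layer output (for example after a ReLU), the count of performed operations drops in proportion to the zeros recorded in the bitmap.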
A view of geometry captured in image data generated by an imaging sensor is compared with a description of the geometry in a volumetric data structure. The volumetric data structure describes the volume at a plurality of levels of detail and includes entries describing voxels defining subvolumes of the volume at multiple levels of detail. The volumetric data structure includes a first entry to describe voxels at a lowest one of the levels of detail and further includes a number of second entries to describe voxels at a higher, second level of detail, the voxels at the second level of detail representing subvolumes of the voxels at the first level of detail. Each of these entries includes bits to indicate whether a corresponding one of the voxels is at least partially occupied with the geometry. One or more of these entries are used in the comparison with the image data.
A machine learning system is provided to enhance various aspects of machine learning models. In some aspects, a substantially photorealistic three-dimensional (3D) graphical model of an object is accessed and a set of training images of the 3D graphical model is generated, the set of training images generated to add imperfections and degrade photorealistic quality of the training images. The set of training images is provided as training data to train an artificial neural network.
A volumetric data structure models a particular volume, representing the particular volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a first level of detail, the first level of detail being the lowest level of detail in the volumetric data structure. Values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. The volumetric data structure further includes a number of second entries representing voxels at a second level of detail higher than the first level of detail, where the voxels at the second level of detail represent subvolumes of volumes represented by voxels at the first level of detail. The number of second entries corresponds to the number of bits in the first set of bits with values indicating that a corresponding voxel volume is occupied.
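The relationship between the first entry and the second entries can be sketched as follows. The sketch assumes each entry is a 64-bit mask covering a 4x4x4 block of voxels and that second entries are stored in ascending order of their parent bit; the abstract does not state these sizes or this ordering, and the function names are hypothetical.

```python
SIDE = 4  # assumption: each entry covers 4 x 4 x 4 voxels, i.e. 64 bits


def bit_index(x, y, z):
    """Position of voxel (x, y, z) within a 64-bit entry."""
    return x + SIDE * y + SIDE * SIDE * z


def build_two_level(occupied):
    """Build (first_entry, second_entries) from fine voxel coordinates in range(16)."""
    children = {}
    for x, y, z in occupied:
        parent = bit_index(x // SIDE, y // SIDE, z // SIDE)
        child = bit_index(x % SIDE, y % SIDE, z % SIDE)
        children[parent] = children.get(parent, 0) | (1 << child)
    first_entry = 0
    for parent in children:
        first_entry |= 1 << parent
    # One second entry per occupied first-level voxel, ordered by parent bit.
    second_entries = [children[parent] for parent in sorted(children)]
    return first_entry, second_entries


def is_occupied(first_entry, second_entries, x, y, z):
    """Look up a fine voxel by descending from the first entry."""
    parent = bit_index(x // SIDE, y // SIDE, z // SIDE)
    if not (first_entry >> parent) & 1:
        return False  # empty coarse voxel: no second entry exists for it
    # The count of set bits below the parent bit locates its second entry.
    rank = bin(first_entry & ((1 << parent) - 1)).count("1")
    child = bit_index(x % SIDE, y % SIDE, z % SIDE)
    return bool((second_entries[rank] >> child) & 1)
```

Because empty coarse voxels get no second entry, the number of second entries equals the number of set bits in the first entry, which is the correspondence the abstract describes.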
A ray is cast into a volume described by a volumetric data structure, which describes the volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a lowest one of the plurality of levels of detail, and values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. A set of second entries in the volumetric data structure describe voxels at a second level of detail, which represent subvolumes of the voxels at the first lowest level of detail. The ray is determined to pass through a particular subset of the voxels at the first level of detail and at least a particular one of the particular subset of voxels is determined to be occupied by geometry.
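As a rough illustration, the sketch below steps a ray through the lowest level of detail and tests the occupancy bit of each voxel it visits. It assumes a 64-bit first entry over a 4x4x4 grid of unit-sized voxels and uses fixed-step sampling, which is simpler than an exact grid traversal and can miss voxels that the ray only clips at a corner; none of these choices come from the abstract.

```python
import math

SIDE = 4  # assumption: the first entry covers a 4 x 4 x 4 grid of unit voxels


def bit_index(x, y, z):
    """Position of coarse voxel (x, y, z) within the 64-bit first entry."""
    return x + SIDE * y + SIDE * SIDE * z


def coarse_voxels_on_ray(origin, direction, step=0.05, max_t=8.0):
    """Ordered coarse voxels that fixed-step samples along the ray fall into."""
    visited = []
    t = 0.0
    while t <= max_t:
        point = [o + t * d for o, d in zip(origin, direction)]
        cell = tuple(int(math.floor(c)) for c in point)
        inside = all(0 <= c < SIDE for c in cell)
        if inside and (not visited or visited[-1] != cell):
            visited.append(cell)
        t += step
    return visited


def first_hit(first_entry, origin, direction):
    """Return the first traversed coarse voxel whose occupancy bit is set, or None."""
    for cell in coarse_voxels_on_ray(origin, direction):
        if (first_entry >> bit_index(*cell)) & 1:
            return cell
    return None
```

In a multi-level structure, a hit at this level would be followed by descending into the corresponding second entry and repeating the test at finer resolution.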
An output of a first one of a plurality of layers within a neural network is identified. A bitmap is determined from the output, the bitmap including a binary matrix. A particular subset of operations for a second one of the plurality of layers is determined to be skipped based on the bitmap. Operations are performed for the second layer other than the particular subset of operations, while the particular subset of operations is skipped.
The present application relates generally to a parallel processing device. The parallel processing device can include a plurality of processing elements, a memory subsystem, and an interconnect system. The memory subsystem can include a plurality of memory slices, at least one of which is associated with one of the plurality of processing elements and comprises a plurality of random access memory (RAM) tiles, each tile having individual read and write ports. The interconnect system is configured to couple the plurality of processing elements and the memory subsystem. The interconnect system includes a local interconnect and a global interconnect.
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
G09G 5/397 - Arrangements specially adapted for transferring the contents of two or more bit-mapped memories to the screen simultaneously, e.g. for mixing or overlay
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
Disclosed examples include accessing sensor data; recognizing, by executing an instruction with programmable circuitry, a feature in the sensor data based on a convolutional neural network; and transitioning, by executing an instruction with the programmable circuitry, a mobile device between at least two of motion feature detection, audio feature detection, or camera feature detection after the feature is recognized in the sensor data, the mobile device to operate at a different level of power consumption after the transition than before the transition.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 40/20 - Movements or behaviour, e.g. gesture recognition
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
H04N 23/65 - Control of camera operation in relation to power supply
H04N 23/45 - Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from two or more image sensors being of different type or operating in different modes, e.g. with a CMOS sensor for moving images in combination with a charge-coupled device [CCD] for still images
H04N 23/61 - Control of cameras or camera modules based on recognised objects
H04N 23/611 - Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
H04N 23/667 - Camera operation mode switching, e.g. between still and video, sport and normal or high and low resolution modes
G06F 16/16 - File or folder operations, e.g. details of user interfaces specifically adapted to file systems
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 25/78 - Detection of presence or absence of voice signals
12.
Systems and methods for distributed training of deep learning models
Systems and methods for distributed training of deep learning models are disclosed. An example local device to train deep learning models includes a reference generator to label input data received at the local device to generate training data, a trainer to train a local deep learning model and to transmit the local deep learning model to a server that is to receive a plurality of local deep learning models from a plurality of local devices, the server to determine a set of weights for a global deep learning model, and an updater to update the local deep learning model based on the set of weights received from the server.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 10/96 - Management of image or video recognition tasks
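The exchange between local devices and the server in the distributed-training abstract above can be sketched in a few lines. The abstract says only that the server determines a set of weights for a global model; the unweighted mean used here, the `mix` parameter, and the function names are assumptions for illustration.

```python
def average_weights(local_models):
    """Server side: combine local models into global weights.

    Assumption: an unweighted element-wise mean across devices.
    """
    count = len(local_models)
    return [sum(column) / count for column in zip(*local_models)]


def update_local(local_weights, global_weights, mix=1.0):
    """Device side: move local weights toward the global weights.

    mix=1.0 replaces the local weights outright; smaller values blend the two.
    """
    return [(1.0 - mix) * local + mix * glob
            for local, glob in zip(local_weights, global_weights)]
```

For example, two devices holding `[1.0, 2.0]` and `[3.0, 4.0]` would receive `[2.0, 3.0]` from the server under this averaging rule.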
Example methods, apparatus, systems and articles of manufacture (e.g., physical storage media) to implement video surveillance with neural networks are disclosed. Example systems disclosed herein include a database to store records of operator-labeled video segments (e.g., as records of operator-labeled video segments). The operator-labeled video segments include reference video segments and corresponding reference event labels describing the video segments. Disclosed example systems also include a neural network including a first instance of an inference engine, and a training engine to train the first instance of the inference engine based on a training set of the operator-labeled video segments obtained from the database, the first instance of the inference engine to infer events from the operator-labeled video segments included in the training set. Disclosed example systems further include a second instance of the inference engine to infer events from monitored video feeds, the second instance of the inference engine being based on the first instance of the inference engine.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
A volumetric data structure models a particular volume, representing the particular volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a first level of detail, the first level of detail being the lowest level of detail in the volumetric data structure. Values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. The volumetric data structure further includes a number of second entries representing voxels at a second level of detail higher than the first level of detail, where the voxels at the second level of detail represent subvolumes of volumes represented by voxels at the first level of detail. The number of second entries corresponds to the number of bits in the first set of bits with values indicating that a corresponding voxel volume is occupied.
Methods, systems, articles of manufacture and apparatus to generate digital scenes are disclosed. An example apparatus to generate labelled models includes a map builder to generate a three-dimensional (3D) model of an input image, a grouping classifier to identify a first zone of the 3D model corresponding to a first type of grouping classification, a human model builder to generate a quantity of placeholder human models corresponding to the first zone, a coordinate engine to assign the quantity of placeholder human models to respective coordinate locations of the first zone, the respective coordinate locations assigned based on the first type of grouping classification, a model characteristics modifier to assign characteristics associated with an aspect type to respective ones of the quantity of placeholder human models, and an annotation manager to associate the assigned characteristics as label data for respective ones of the quantity of placeholder human models.
An example stationary tracker includes memory to store fixed geographic location information indicative of a fixed geographic location of the stationary tracker, and to store a reference feature image; and at least one processor to: determine a feature in an image is a non-displayable feature by comparing the feature to the reference feature image; and generate a masked image, the masked image to mask the non-displayable feature based on the non-displayable feature not allowed to be displayed when captured from the fixed geographic location of the stationary tracker, and the masked image to display a displayable feature in the image; and a wireless interface to detect a wireless tag located on a tag bearer, the at least one processor to determine the tag bearer is the displayable feature in the image based on the wireless tag.
G06F 21/84 - Protecting input, output or interconnection devices output devices, e.g. displays or monitors
G06K 9/32 - Aligning or centering of the image pick-up or image-field
G06K 9/46 - Extraction of features or characteristics of the image
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
G06N 5/00 - Computing arrangements using knowledge-based models
H04W 4/02 - Services making use of location information
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
A ray is cast into a volume described by a volumetric data structure, which describes the volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a lowest one of the plurality of levels of detail, and values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. A set of second entries in the volumetric data structure describe voxels at a second level of detail, which represent subvolumes of the voxels at the first lowest level of detail. The ray is determined to pass through a particular subset of the voxels at the first level of detail and at least a particular one of the particular subset of voxels is determined to be occupied by geometry.
Methods, systems, apparatus, and articles of manufacture to reduce memory latency when fetching pixel kernels are disclosed. An example apparatus includes first interface circuitry to receive a first request from a hardware accelerator at a first time including first coordinates of a first pixel disposed in a first image block, second interface circuitry to receive a second request including second coordinates from the hardware accelerator at a second time after the first time, and kernel retriever circuitry to, in response to the second request, determine whether the first image block is in cache storage based on a mapping of the second coordinates to a block tag, and, in response to determining that the first image block is in the cache storage, access, in parallel, two or more memory devices associated with the cache storage to transfer a plurality of image blocks including the first image block to the hardware accelerator.
The present application provides a method of corner detection and an image processing system for detecting corners in an image. The preferred implementation is in software using enabling and reusable hardware features in the underlying vector processor architecture. The advantage of this combined software and programmable processor datapath hardware is that the same hardware used for the FAST algorithm can also be readily applied to a variety of other computational tasks, not limited to image processing.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
20.
Apparatus, systems, and methods for low power computational imaging
The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
21.
Methods and apparatus to operate a mobile camera for low-power usage
A disclosed example to operate a mobile camera includes recognizing a first feature in first sensor data in response to the first feature being detected in the first sensor data; transitioning the mobile camera from a first feature detection state to a second feature detection state in response to the recognizing of the first feature, the mobile camera to operate using higher power consumption in the second feature detection state than in the first feature detection state; recognizing a second feature in second sensor data in the second feature detection state; and sending to an external device at least one of first metadata corresponding to the first feature or second metadata corresponding to the second feature.
H04N 7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 40/20 - Movements or behaviour, e.g. gesture recognition
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
H04N 23/65 - Control of camera operation in relation to power supply
H04N 23/45 - Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from two or more image sensors being of different type or operating in different modes, e.g. with a CMOS sensor for moving images in combination with a charge-coupled device [CCD] for still images
H04N 23/61 - Control of cameras or camera modules based on recognised objects
H04N 23/611 - Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
H04N 23/667 - Camera operation mode switching, e.g. between still and video, sport and normal or high and low resolution modes
G06F 16/16 - File or folder operations, e.g. details of user interfaces specifically adapted to file systems
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 25/78 - Detection of presence or absence of voice signals
G10L 15/16 - Speech classification or search using artificial neural networks
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
Methods, apparatus, systems and articles of manufacture to compress data are disclosed. An example apparatus includes a data slicer to split a dataset into a plurality of blocks of data; a data processor to select a first compression technique for a first block of the plurality of blocks of data based on first characteristics of the first block; and select a second compression technique for a second block of the plurality of blocks of data based on second characteristics of the second block; a first compressor to compress the first block using the first compression technique to generate a first compressed block of data; a second compressor to compress the second block using the second compression technique to generate a second compressed block of data; and a header generator to generate a first header identifying the first compression technique and a second header identifying the second compression technique.
H03M 7/30 - Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
H03M 7/46 - Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
H03M 7/40 - Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
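The block-wise compression scheme in the abstract above can be sketched with two techniques and a one-byte header per block. The choice of run-length coding versus DEFLATE, the average-run-length heuristic, the 16-byte block size, and all names below are illustrative assumptions and are not taken from the abstract.

```python
import zlib

RLE, DEFLATE = 1, 2  # hypothetical one-byte technique identifiers


def rle_encode(block: bytes) -> bytes:
    """Encode as (run length, byte value) pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(block):
        run = 1
        while i + run < len(block) and block[i + run] == block[i] and run < 255:
            run += 1
        out += bytes((run, block[i]))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    for k in range(0, len(data), 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)


def choose_technique(block: bytes) -> int:
    """Pick a technique from a block characteristic (here, average run length)."""
    runs = sum(1 for i in range(len(block)) if i == 0 or block[i] != block[i - 1])
    return RLE if runs and len(block) / runs >= 4 else DEFLATE


def compress(dataset: bytes, block_size: int = 16):
    """Split into blocks; return a (header, compressed block) pair per block."""
    records = []
    for start in range(0, len(dataset), block_size):
        block = dataset[start:start + block_size]
        technique = choose_technique(block)
        payload = rle_encode(block) if technique == RLE else zlib.compress(block)
        records.append((bytes([technique]), payload))
    return records


def decompress(records) -> bytes:
    """Use each header to select the matching decoder."""
    return b"".join(
        rle_decode(payload) if header[0] == RLE else zlib.decompress(payload)
        for header, payload in records
    )
```

Storing the technique in a per-block header is what lets the decoder handle a dataset whose blocks were compressed in different ways.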
A view of geometry captured in image data generated by an imaging sensor is compared with a description of the geometry in a volumetric data structure. The volumetric data structure describes the volume at a plurality of levels of detail and includes entries describing voxels defining subvolumes of the volume at multiple levels of detail. The volumetric data structure includes a first entry to describe voxels at a lowest one of the levels of detail and further includes a number of second entries to describe voxels at a higher, second level of detail, the voxels at the second level of detail representing subvolumes of the voxels at the first level of detail. Each of these entries includes bits to indicate whether a corresponding one of the voxels is at least partially occupied with the geometry. One or more of these entries are used in the comparison with the image data.
The present application relates generally to a parallel processing device. The parallel processing device can include a plurality of processing elements, a memory subsystem, and an interconnect system. The memory subsystem can include a plurality of memory slices, at least one of which is associated with one of the plurality of processing elements and comprises a plurality of random access memory (RAM) tiles, each tile having individual read and write ports. The interconnect system is configured to couple the plurality of processing elements and the memory subsystem. The interconnect system includes a local interconnect and a global interconnect.
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
G09G 5/397 - Arrangements specially adapted for transferring the contents of two or more bit-mapped memories to the screen simultaneously, e.g. for mixing or overlay
G06T 1/20 - Processor architecturesProcessor configuration, e.g. pipelining
Methods, apparatus, systems and articles of manufacture to store and access multi-dimensional data are disclosed. An example apparatus includes a memory; a memory allocator to allocate part of the memory for storage of a multi-dimensional data object; and a storage element organizer to: separate the multi-dimensional data into storage elements; store the storage elements in the memory, the stored storage elements being selectively executable; store starting memory address locations for the storage elements in an array in the memory, the array to facilitate selectable access of data of the stored elements; store a pointer for the array into the memory.
Methods, systems, apparatus and articles of manufacture to identify features within an image are disclosed herein. An example apparatus includes a horizontal cost (HCOST) engine to apply a first row of pixels of a macroblock to an input of a first HCOST unit, the first HCOST unit including a number of difference calculators; and a difference calculator engine to apply corresponding rows of pixels of a search window of a source image to corresponding ones of the number of difference calculators of the first HCOST unit, the corresponding ones of the number of difference calculators to calculate respective sums of absolute difference (SAD) values between (a) the first row of pixels of the macroblock and (b) the corresponding rows of pixels of the search window.
H04N 19/436 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/517 - Processing of motion vectors by encoding
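The row-wise sum of absolute differences (SAD) in the abstract above can be shown in software form. The abstract describes hardware difference calculators working in parallel; the sequential functions below, and the `x_offset` parameter, are only an assumed illustration of the arithmetic.

```python
def row_sad(row_a, row_b):
    """Sum of absolute differences between two equal-length pixel rows."""
    return sum(abs(a - b) for a, b in zip(row_a, row_b))


def hcost(macroblock_row, window_rows, x_offset=0):
    """SAD of one macroblock row against each search-window row.

    Each list element corresponds to the output of one difference calculator;
    x_offset selects the horizontal position within the search window.
    """
    width = len(macroblock_row)
    return [row_sad(macroblock_row, row[x_offset:x_offset + width])
            for row in window_rows]
```

A lower SAD indicates a closer match between the macroblock row and that row of the search window.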
A neural network model is trained, where the training includes multiple training iterations. Weights of a particular layer of the neural network are pruned during a forward pass of a particular one of the training iterations. During the same forward pass of the particular training iteration, values of weights of the particular layer are quantized to determine a quantized-sparsified subset of weights for the particular layer. A compressed version of the neural network model is generated from the training based at least in part on the quantized-sparsified subset of weights.
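A minimal sketch of pruning and quantizing a layer's weights in one step follows. It assumes magnitude-based pruning and uniform symmetric quantization, and it omits the surrounding training loop; the abstract does not specify either technique, and the function name is hypothetical.

```python
def prune_and_quantize(weights, sparsity, num_bits):
    """Zero the smallest-magnitude weights, then quantize the survivors.

    Assumptions: magnitude pruning at the given sparsity, followed by uniform
    symmetric quantization to 2**(num_bits - 1) - 1 levels on each side of zero.
    """
    n_prune = int(len(weights) * sparsity)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(by_magnitude[:n_prune])
    max_abs = max((abs(w) for i, w in enumerate(weights) if i not in pruned),
                  default=0.0)
    levels = 2 ** (num_bits - 1) - 1
    scale = max_abs / levels if max_abs else 1.0
    return [0.0 if i in pruned else round(w / scale) * scale
            for i, w in enumerate(weights)]
```

In a training setting this function would be applied to a layer's weights inside the forward pass, so that the loss is computed on the quantized-sparsified values.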
A grammar is used in a grammatical evolution of a set of parent neural network models to generate a set of child neural network models. A generation of neural network models is tested based on a set of test data, where the generation includes the set of child neural network models. Respective values for each one of a plurality of attributes are determined for each neural network in the generation, where one of the attributes includes a validation accuracy value determined from the test. Multi-objective optimization is performed based on the values of the plurality of attributes for the generation of neural networks and a subset of the generation of neural network models is selected based on the results of the multi-objective optimization.
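The selection step can be illustrated with a Pareto-front filter, one common form of multi-objective optimization. The abstract does not name a specific method; the use of Pareto dominance, the second attribute (negated parameter count), and the function names are assumptions for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one.

    All objectives are treated as values to maximise.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(models):
    """Names of models that no other model in the generation dominates.

    `models` maps a model name to its tuple of attribute values.
    """
    return [name for name, objectives in models.items()
            if not any(dominates(other, objectives)
                       for other_name, other in models.items()
                       if other_name != name)]
```

For instance, with attributes given as (validation accuracy, negated parameter count), a model is kept only if no other model is both at least as accurate and at least as small.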
Systems and methods are provided for image classification using histograms of oriented gradients (HoG) in conjunction with a trainer. The efficiency of the process is greatly increased by first establishing a bitmap which identifies a subset of the pixels in the HoG window as including relevant foreground information, and limiting the HoG calculation and comparison process to only the pixels included in the bitmap.
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/50 - Extraction of image or video features by performing operations within image blocks; Extraction of image or video features by using histograms, e.g. histogram of oriented gradients [HoG]; Extraction of image or video features by summing image-intensity values; Projection analysis
G06V 10/94 - Hardware or software architectures specially adapted for image or video understanding
G06V 30/194 - References adjustable by an adaptive method, e.g. learning
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
30.
Systems and methods for distributed training of deep learning models
Systems and methods for distributed training of deep learning models are disclosed. An example local device to train deep learning models includes a reference generator to label input data received at the local device to generate training data, a trainer to train a local deep learning model and to transmit the local deep learning model to a server that is to receive a plurality of local deep learning models from a plurality of local devices, the server to determine a set of weights for a global deep learning model, and an updater to update the local deep learning model based on the set of weights received from the server.
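One way the server could determine a set of global weights from the received local models is a plain element-wise average. This is only a sketch: the abstract does not say how the server combines the local models, so the averaging rule and the flat-list weight representation are assumptions.

```python
def average_weights(local_models):
    """Element-wise mean of equally shaped weight lists.

    Each entry of `local_models` is a hypothetical flat list of weights
    sent by one local device.
    """
    if not local_models:
        raise ValueError("need at least one local model")
    if len({len(ws) for ws in local_models}) != 1:
        raise ValueError("all local models must have the same shape")
    n = len(local_models)
    return [sum(ws) / n for ws in zip(*local_models)]


# Two local devices report their weights; the server averages them and
# each device would then update its local model from the result.
global_weights = average_weights([[1.0, 2.0], [3.0, 4.0]])
```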
Methods, apparatus, systems and articles of manufacture to perform dot product calculations using sparse vectors are disclosed. An example apparatus includes means for generating a mask vector based on a first logic operation on a difference vector and an inverse of a control vector, the control vector based on a first bitmap of a first sparse vector and a second bitmap of a second sparse vector; means for generating a first product of a third value from the first sparse vector and a fourth value from the second sparse vector, the third value based on (i) the mask vector and (ii) a second sparsity map based on the first sparse vector, the fourth value corresponding to (i) the mask vector and (ii) a second sparsity map corresponding to the second sparse vector; and means for adding the first product to a second product of a previous iteration.
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
A particular voxel is identified within a volume and a hash table is used to obtain volumetric data describing the particular voxel within the volume. Values of x-, y- and z-coordinates in the volume associated with the particular voxel are determined, and an index value associated with the particular voxel is determined according to a hashing algorithm, where the index value is determined from summing weighted values of the x-, y- and z-coordinates, and the weighted values are based on a variable value corresponding to a dimension of the volume. A particular entry is identified in the hash table based on the index value, where the particular entry includes volumetric data, and the volumetric data identifies, for the particular voxel, whether the particular voxel is occupied.
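The weighted-sum index can be sketched as below. The specific weights (1, `dim`, `dim * dim`) are an assumption: the abstract only says the weights are based on a value corresponding to a dimension of the volume. The dictionary standing in for the hash table is likewise illustrative.

```python
def voxel_index(x, y, z, dim):
    """Sum of weighted coordinates for a cubic volume of edge length `dim`.

    Weights 1, dim and dim*dim are a hypothetical choice that makes the
    index unique for every voxel inside the volume.
    """
    if not all(0 <= c < dim for c in (x, y, z)):
        raise ValueError("coordinate outside the volume")
    return x + y * dim + z * dim * dim


# Hash table mapping index -> volumetric data (here just an occupancy flag).
table = {voxel_index(1, 2, 3, dim=4): {"occupied": True}}


def is_occupied(x, y, z, dim):
    entry = table.get(voxel_index(x, y, z, dim))
    return bool(entry and entry["occupied"])
```

With these weights, voxel (1, 2, 3) in a 4 x 4 x 4 volume maps to index 1 + 8 + 48 = 57.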
An output of a first one of a plurality of layers within a neural network is identified. A bitmap is determined from the output, the bitmap including a binary matrix. A particular subset of operations for a second one of the plurality of layers is determined to be skipped based on the bitmap. Operations are performed for the second layer other than the particular subset of operations, while the particular subset of operations is skipped.
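A minimal sketch of bitmap-driven skipping follows. It assumes the first layer ends in a ReLU, that the bitmap marks non-zero outputs, and that the second layer is fully connected; none of these details is stated in the abstract.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def to_bitmap(values):
    """1 where the first layer's output is non-zero, 0 elsewhere."""
    return [1 if v != 0 else 0 for v in values]


def dense_with_skipping(inputs, bitmap, weights):
    """Second-layer matrix-vector product that skips multiplies whose
    input bit is 0. Returns the outputs and the multiply count."""
    performed = 0
    outputs = []
    for row in weights:
        acc = 0.0
        for x, bit, w in zip(inputs, bitmap, row):
            if bit:  # a zero input contributes nothing, so skip it
                acc += x * w
                performed += 1
        outputs.append(acc)
    return outputs, performed


first_out = relu([-1.0, 2.0, 0.0, 3.0])
bitmap = to_bitmap(first_out)
outputs, performed = dense_with_skipping(
    first_out, bitmap, [[1.0, 1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5]]
)
```

Half of the first layer's outputs are zero here, so only 4 of the 8 multiplies are performed while the result is unchanged.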
An example stationary tracker includes: memory to store fixed geographic location information indicative of a fixed geographic location of the stationary tracker, and to store a reference feature image; and at least one processor to: determine a feature in an image is a non-displayable feature by comparing the feature to the reference feature image; and generate a masked image, the masked image to mask the non-displayable feature based on the non-displayable feature not being allowed to be displayed when captured from the fixed geographic location of the stationary tracker, and the masked image to display a displayable feature in the image.
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
A ray is cast into a volume described by a volumetric data structure, which describes the volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a lowest one of the plurality of levels of detail, and values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. A set of second entries in the volumetric data structure describe voxels at a second level of detail, which represent subvolumes of the voxels at the first, lowest level of detail. The ray is determined to pass through a particular subset of the voxels at the first level of detail and at least a particular one of the particular subset of voxels is determined to be occupied by geometry.
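The occupancy test that a ray traversal would perform for each voxel it passes through can be sketched with a two-level structure. The layout is an assumption made for illustration: each entry is taken to be a 64-bit integer covering a 4 x 4 x 4 block, and second entries are assumed to exist only for occupied first-level voxels, stored in bit order. The abstract does not fix either detail.

```python
def bit_index(x, y, z):
    """Bit position of voxel (x, y, z) inside one 4x4x4 entry."""
    return x + 4 * y + 16 * z


def is_occupied(first_entry, second_entries, x, y, z):
    """Occupancy of second-level voxel (x, y, z), each coordinate in 0..15."""
    parent = bit_index(x // 4, y // 4, z // 4)
    if not (first_entry >> parent) & 1:
        return False  # coarse voxel is empty, so nothing finer to check
    # The number of set bits below the parent's bit selects its child entry.
    child = bin(first_entry & ((1 << parent) - 1)).count("1")
    return bool((second_entries[child] >> bit_index(x % 4, y % 4, z % 4)) & 1)


# Coarse voxels 0 and 5 are occupied; each has one occupied sub-voxel.
first_entry = (1 << 0) | (1 << 5)
second_entries = [1 << 0, 1 << 63]
```

An empty coarse voxel is rejected with a single bit test, which is what lets a traversal skip large empty regions before descending to the second level.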
Methods, systems, articles of manufacture and apparatus to generate digital scenes are disclosed. An example apparatus to generate labelled models includes a map builder to generate a three-dimensional (3D) model of an input image, a grouping classifier to identify a first zone of the 3D model corresponding to a first type of grouping classification, a human model builder to generate a quantity of placeholder human models corresponding to the first zone, a coordinate engine to assign the quantity of placeholder human models to respective coordinate locations of the first zone, the respective coordinate locations assigned based on the first type of grouping classification, a model characteristics modifier to assign characteristics associated with an aspect type to respective ones of the quantity of placeholder human models, and an annotation manager to associate the assigned characteristics as label data for respective ones of the quantity of placeholder human models.
A disclosed example to operate a mobile camera includes recognizing a first feature in first sensor data in response to the first feature being detected in the first sensor data; transitioning the mobile camera from a first feature detection state to a second feature detection state in response to the recognizing of the first feature, the mobile camera to operate using higher power consumption in the second feature detection state than in the first feature detection state; recognizing a second feature in second sensor data in the second feature detection state; and sending to an external device at least one of first metadata corresponding to the first feature or second metadata corresponding to the second feature.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
G10L 19/00 - Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 25/78 - Detection of presence or absence of voice signals
G10L 15/16 - Speech classification or search using artificial neural networks
38.
Dot product calculators and methods of operating the same
Methods, apparatus, systems and articles of manufacture to perform dot product calculations using sparse vectors are disclosed. An example dot product calculator includes a first logic AND gate to perform a first logic AND operation with a first input vector and a second input vector, the first logic AND gate to output a control vector; a second logic AND gate to perform a second logic AND operation with a difference vector and an inverse of the control vector, the second logic AND gate to output a mask vector; a third logic AND gate to output a first vector; a first counter to generate a first ones count based on a first total number of ones of the first vector; a fourth logic AND gate to output a second vector; a second counter to generate a second ones count; and a multiplier to generate a product.
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
Methods, apparatus, systems and articles of manufacture to store and access multi-dimensional data are disclosed. An example apparatus includes a memory; a memory allocator to allocate part of the memory for storage of a multi-dimensional data object; and a storage element organizer to: separate the multi-dimensional data into storage elements; store the storage elements in the memory, the stored storage elements being selectively executable; store starting memory address locations for the storage elements in an array in the memory, the array to facilitate selectable access of data of the stored elements; and store a pointer for the array into the memory.
Methods, systems, apparatus and articles of manufacture to identify features within an image are disclosed herein. An example apparatus includes a horizontal cost (HCOST) engine to apply a first row of pixels of a macroblock to an input of a first HCOST unit, the first HCOST unit including a number of difference calculators; and a difference calculator engine to apply corresponding rows of pixels of a search window of a source image to corresponding ones of the number of difference calculators of the first HCOST unit, the corresponding ones of the number of difference calculators to calculate respective sums of absolute difference (SAD) values between (a) the first row of pixels of the macroblock and (b) the corresponding rows of pixels of the search window.
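The per-row sum of absolute differences (SAD) that each difference calculator produces can be sketched in software as follows. This models only the arithmetic, not the HCOST hardware; the function names are hypothetical.

```python
def sad(row_a, row_b):
    """Sum of absolute differences between two equally long pixel rows."""
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same length")
    return sum(abs(a - b) for a, b in zip(row_a, row_b))


def row_sads(macroblock_row, window_rows):
    """Compare one macroblock row against each row of the search window,
    as the parallel difference calculators would."""
    return [sad(macroblock_row, row) for row in window_rows]


costs = row_sads([1, 2, 3], [[1, 2, 3], [2, 2, 5]])
```

A SAD of 0 marks an exact match, so the first window row in this example is the best candidate.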
Methods, apparatus, systems and articles of manufacture to compress data are disclosed. An example apparatus includes an off-chip memory to store data; a data slicer to split a dataset into a plurality of blocks of data; a data processor to select a first compression technique for a first block of the plurality of blocks of data based on first characteristics of the first block; and select a second compression technique for a second block of the plurality of blocks of data based on second characteristics of the second block; a first compressor to compress the first block using the first compression technique to generate a first compressed block of data; a second compressor to compress the second block using the second compression technique to generate a second compressed block of data; a header generator to generate a first header identifying the first compression technique and a second header identifying the second compression technique; and an interface to transmit the first compressed block of data with the first header and the second compressed block of data with the second header to be stored in the off-chip memory.
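The slice, select, compress and tag flow can be sketched as below. The two techniques (run-length encoding and zlib), the one-byte header, the 64-byte block size and the "few distinct byte values" heuristic are all assumptions for the example; the abstract names neither the techniques nor the block characteristics used to choose between them.

```python
import zlib

RLE, ZLIB = 0, 1  # hypothetical technique identifiers stored in each header


def rle_encode(block):
    """Encode a block as (run length, byte value) pairs, runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(block):
        run = 1
        while i + run < len(block) and run < 255 and block[i + run] == block[i]:
            run += 1
        out += bytes((run, block[i]))
        i += run
    return bytes(out)


def rle_decode(data):
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


def select_technique(block):
    # Illustrative characteristic: few distinct byte values hints at long runs.
    return RLE if len(set(block)) <= 4 else ZLIB


def compress(dataset, block_size=64):
    """Return a list of (header, compressed block) pairs."""
    packed = []
    for start in range(0, len(dataset), block_size):
        block = dataset[start:start + block_size]
        technique = select_technique(block)
        body = rle_encode(block) if technique == RLE else zlib.compress(block)
        packed.append((bytes([technique]), body))
    return packed


def decompress(packed):
    out = bytearray()
    for header, body in packed:
        out += rle_decode(body) if header[0] == RLE else zlib.decompress(body)
    return bytes(out)


data = bytes(64) + bytes(range(64))
packed = compress(data)
```

Because every block carries its own header, the reader can decode each block without knowing in advance which technique the writer chose.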
H03M 7/30 - Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
H03M 7/46 - Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
H03M 7/40 - Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
42.
Apparatus, systems, and methods for low power computational imaging
The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
43.
Methods, systems and apparatus to optimize pipeline execution
Methods, apparatus, systems, and articles of manufacture to optimize pipeline execution are disclosed. An example apparatus includes a cost computation manager to determine a value associated with a first location of a first pixel of a first image and a second location of a second pixel of a second image by calculating a matching cost between the first location and the second location, and an aggregation generator to generate a disparity map including the value, and determine a minimum value based on the disparity map corresponding to a difference in horizontal coordinates between the first location and the second location.
A neural network model is trained, where the training includes multiple training iterations. Weights of a particular layer of the neural network are pruned during a forward pass of a particular one of the training iterations. During the same forward pass of the particular training iteration, values of weights of the particular layer are quantized to determine a quantized-sparsified subset of weights for the particular layer. A compressed version of the neural network model is generated from the training based at least in part on the quantized-sparsified subset of weights.
A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 1/3203 - Power management, i.e. event-based initiation of a power-saving mode
Methods, systems, apparatus, and articles of manufacture to reduce memory latency when fetching pixel kernels are disclosed. An example apparatus includes a prefetch kernel retriever to generate a block tag based on a first request from a hardware accelerator, the first request including first coordinates of a first pixel disposed in a first image block, a memory interface engine to store the first image block including a plurality of pixels including the pixel in a cache storage based on the block tag, and a kernel retriever to access two or more memory devices included in the cache storage in parallel to transfer a plurality of image blocks including the first image block when a second request is received including second coordinates of a second pixel disposed in the first image block.
An example apparatus includes a memory, a data writer to write received first data into the memory in a first order, and a data reader to read the first data from the memory in a second order, wherein the data writer is to write second data into the memory in the second order.
A view of geometry captured in image data generated by an imaging sensor is compared with a description of the geometry in a volumetric data structure. The volumetric data structure describes the volume at a plurality of levels of detail and includes entries describing voxels defining subvolumes of the volume at multiple levels of detail. The volumetric data structure includes a first entry to describe voxels at a lowest one of the levels of detail and further includes a number of second entries to describe voxels at a higher, second level of detail, the voxels at the second level of detail representing subvolumes of the voxels at the first level of detail. Each of these entries includes bits to indicate whether a corresponding one of the voxels is at least partially occupied with the geometry. One or more of these entries are used in the comparison with the image data.
Methods, apparatus, systems and articles of manufacture to perform dot product calculations using sparse vectors are disclosed. An example dot product calculator includes a counter to determine a trailing binary count of a control vector, the control vector corresponding to a first result of a first logic AND operation on a first bitmap of a first sparse vector and a second bitmap of a second sparse vector. The example dot product calculator further includes a mask generator to generate a mask vector based on the trailing binary count. The example dot product calculator further includes an interface to access a first value of the first sparse vector based on a second result of a second logic AND operation on the first bitmap and the mask vector and access a second value of the second sparse vector based on a third result of a third logic AND operation on the second bitmap and the mask vector. The example dot product calculator further includes a multiplier to multiply the first value with the second value to generate a product.
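A software model of the bitmap-driven dot product may help make the data flow concrete. It is a sketch under assumptions: each sparse vector is stored as an integer bitmap (bit i set when element i is non-zero) plus a packed list of non-zero values, the trailing binary count is taken to be a trailing-zero count, and a ones count of the masked bitmap is used to locate each value in the packed list. The helper names are hypothetical.

```python
def to_sparse(dense):
    """Return (bitmap, packed non-zero values) for a dense list."""
    bitmap, values = 0, []
    for i, v in enumerate(dense):
        if v != 0:
            bitmap |= 1 << i
            values.append(v)
    return bitmap, values


def popcount(x):
    return bin(x).count("1")


def sparse_dot(bitmap_a, values_a, bitmap_b, values_b):
    control = bitmap_a & bitmap_b  # positions where both vectors are non-zero
    total = 0
    while control:
        # Trailing-zero count of the control vector: next shared position.
        trailing = (control & -control).bit_length() - 1
        mask = (1 << trailing) - 1  # ones below that position
        # Ones below the position give the index into each packed list.
        a = values_a[popcount(bitmap_a & mask)]
        b = values_b[popcount(bitmap_b & mask)]
        total += a * b
        control &= control - 1  # clear the bit just handled
    return total


bitmap_a, values_a = to_sparse([0, 3, 0, 5, 2])
bitmap_b, values_b = to_sparse([4, 0, 0, 6, 7])
result = sparse_dot(bitmap_a, values_a, bitmap_b, values_b)
```

Only positions 3 and 4 are non-zero in both vectors, so just two multiplies are performed: 5 * 6 + 2 * 7 = 44.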
G06F 7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices for evaluating functions by calculation
A grammar is used in a grammatical evolution of a set of parent neural network models to generate a set of child neural network models. A generation of neural network models is tested based on a set of test data, where the generation includes the set of child neural network models. Respective values for each one of a plurality of attributes are determined for each neural network in the generation, where one of the attributes includes a validation accuracy value determined from the test. Multi-objective optimization is performed based on the values of the plurality of attributes for the generation of neural networks and a subset of the generation of neural network models is selected based on the results of the multi-objective optimization.
Examples to selectively generate a masked image include: a convolutional neural network detector to detect a first feature and a second feature in an image captured by a camera; a feature recognizer to determine the first feature is a displayable feature and the second feature is a non-displayable feature by comparing the first and second features of the image to reference feature images stored in a memory; and a blur generator to generate the masked image to display the displayable feature and mask the non-displayable feature.
Methods, apparatus, systems, and articles of manufacture are disclosed to improve convolution efficiency of a convolution neural network (CNN) accelerator. An example hardware accelerator includes a hardware data path element (DPE) in a DPE array, the hardware DPE including an accumulator, and a multiplier coupled to the accumulator, the multiplier to multiply first inputs including an activation value and a filter coefficient value to generate a first convolution output when the hardware DPE is in a convolution mode, and a controller coupled to the DPE array, the controller to adjust the hardware DPE from the convolution mode to a pooling mode by causing at least one of the multiplier or the accumulator to generate a second convolution output based on second inputs, the second inputs including an output location value of a pool area, at least one of the first inputs different from at least one of the second inputs.
A raycaster performs a raycasting algorithm, where the raycasting algorithm takes, as an input, a sparse hierarchical volumetric data structure. Performing the raycasting algorithm includes casting a plurality of rays from a reference point into the 3D volume, and, for each of the plurality of rays, traversing the ray to determine whether voxels in the set of voxels are intersected by the ray and are occupied, where the ray is to be traversed according to an approximate traversal algorithm.
Example methods, apparatus, systems and articles of manufacture (e.g., physical storage media) to implement video surveillance with neural networks are disclosed. Example systems disclosed herein include a database to store records of operator-labeled video segments (e.g., as records of operator-labeled video segments). The operator-labeled video segments include reference video segments and corresponding reference event labels describing the video segments. Disclosed example systems also include a neural network including a first instance of an inference engine, and a training engine to train the first instance of the inference engine based on a training set of the operator-labeled video segments obtained from the database, the first instance of the inference engine to infer events from the operator-labeled video segments included in the training set. Disclosed example systems further include a second instance of the inference engine to infer events from monitored video feeds, the second instance of the inference engine being based on the first instance of the inference engine.
G08B 13/196 - Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
Example methods, apparatus, systems and articles of manufacture (e.g., physical storage media) to implement video surveillance with neural networks are disclosed. Example systems disclosed herein include a database to store records of operator-labeled video segments (e.g., as records of operator-labeled video segments). The operator-labeled video segments include reference video segments and corresponding reference event labels describing the video segments. Disclosed example systems also include a neural network including a first instance of an inference engine, and a training engine to train the first instance of the inference engine based on a training set of the operator-labeled video segments obtained from the database, the first instance of the inference engine to infer events from the operator-labeled video segments included in the training set. Disclosed example systems further include a second instance of the inference engine to infer events from monitored video feeds, the second instance of the inference engine being based on the first instance of the inference engine.
The present application relates generally to a parallel processing device. The parallel processing device can include a plurality of processing elements, a memory subsystem, and an interconnect system. The memory subsystem can include a plurality of memory slices, at least one of which is associated with one of the plurality of processing elements and comprises a plurality of random access memory (RAM) tiles, each tile having individual read and write ports. The interconnect system is configured to couple the plurality of processing elements and the memory subsystem. The interconnect system includes a local interconnect and a global interconnect.
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
G09G 5/397 - Arrangements specially adapted for transferring the contents of two or more bit-mapped memories to the screen simultaneously, e.g. for mixing or overlay
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
A pruned version of a neural network is generated by determining pruned versions of each of a plurality of layers of the network. The pruned version of each layer is determined by sorting a set of channels of the layer based on respective weight values of each channel in the set. A percentage of the set of channels are pruned based on the sorting to form a thinned version of the layer. Accuracy of a thinned version of the neural network is tested, where the thinned version of the neural network includes the thinned version of the layer. The thinned version of the layer is used to generate the pruned version of the layer based on the accuracy of the thinned version of the neural network exceeding a threshold accuracy value. A pruned version of the neural network is generated to include the pruned versions of the plurality of layers.
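The sort, thin and accuracy-check loop for a single layer can be sketched as follows. Ranking channels by the sum of absolute weights and the `evaluate` callback are assumptions: the abstract says only that channels are sorted by their weight values and that accuracy is tested against a threshold.

```python
def thin_layer(channels, percent):
    """Drop the `percent` of channels with the smallest sum of |weights|,
    keeping the remaining channels in their original order."""
    n_prune = int(len(channels) * percent / 100)
    ranked = sorted(
        range(len(channels)),
        key=lambda i: sum(abs(w) for w in channels[i]),
    )
    dropped = set(ranked[:n_prune])
    return [c for i, c in enumerate(channels) if i not in dropped]


def prune_layer(channels, percent, evaluate, threshold):
    """Accept the thinned layer only if the network's accuracy with it
    exceeds `threshold`. `evaluate` is a hypothetical callback that
    returns that accuracy."""
    thinned = thin_layer(channels, percent)
    return thinned if evaluate(thinned) > threshold else channels


channels = [[0.1, 0.1], [1.0, -1.0], [0.5, 0.5], [0.01, 0.0]]
accepted = prune_layer(channels, 50, lambda layer: 0.9, threshold=0.8)
rejected = prune_layer(channels, 50, lambda layer: 0.5, threshold=0.8)
```

Applying `prune_layer` to each layer in turn and collecting the results would give the pruned network described above.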
Methods, apparatus, systems and articles of manufacture to reconstruct scenes using convolutional neural networks are disclosed. An example apparatus includes a sensor data acquirer to acquire ground truth data representing an environment, an environment detector to identify an environmental characteristic of the environment, a synthetic database builder to apply noise to the ground truth data to form a training set, a model builder to train a machine learning model using the training set and the ground truth data, and a model adjustor to modify the machine learning model to include residual OR-gate connections intermediate respective layers of the machine learning model. The synthetic database builder is further to store the machine learning model in association with the environmental characteristic of the environment.
A machine learning system is provided to enhance various aspects of machine learning models. In some aspects, a substantially photorealistic three-dimensional (3D) graphical model of an object is accessed and a set of training images of the 3D graphical model are generated, the set of training images generated to add imperfections and degrade photorealistic quality of the training images. The set of training images are provided as training data to train an artificial neural network.
Systems and methods are provided for image classification using histograms of oriented gradients (HoG) in conjunction with a trainer. The efficiency of the process is greatly increased by first establishing a bitmap which identifies a subset of the pixels in the HoG window as including relevant foreground information, and limiting the HoG calculation and comparison process to only the pixels included in the bitmap.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of features or characteristics of the image
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
62.
METHODS, SYSTEMS, ARTICLES OF MANUFACTURE AND APPARATUS TO GENERATE DIGITAL SCENES
Methods, systems, articles of manufacture and apparatus to generate digital scenes are disclosed. An example apparatus to generate labelled models includes a map builder to generate a three-dimensional (3D) model of an input image, a grouping classifier to identify a first zone of the 3D model corresponding to a first type of grouping classification, a human model builder to generate a quantity of placeholder human models corresponding to the first zone, a coordinate engine to assign the quantity of placeholder human models to respective coordinate locations of the first zone, the respective coordinate locations assigned based on the first type of grouping classification, a model characteristics modifier to assign characteristics associated with an aspect type to respective ones of the quantity of placeholder human models, and an annotation manager to associate the assigned characteristics as label data for respective ones of the quantity of placeholder human models.
An example includes sending (502) first weight values to first client devices; accessing (504) sets of updated weight values provided by the first client devices, the updated weight values generated by the first client devices training respective first convolutional neural networks, CNNs, based on: the first weight values, and sensor data generated at the client devices; testing (506) performance in a second CNN of at least one of: the sets of the updated weight values, or a combination of ones of the updated weight values from the sets of the updated weight values; selecting (512) server-synchronized weight values from the at least one of: the sets of the updated weight values, or a combination of ones of the updated weight values from the sets of the updated weight values; and sending (518) the server-synchronized weight values to at least one of: at least some of the first client devices, or second client devices.
Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding a video frame in parallel.
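Combining two configurable bit fields of an address into one cache index can be sketched as below. The parameter names stand in for the contents of the index configuration register and are hypothetical; the abstract does not give field widths or positions.

```python
def extract_bits(value, shift, width):
    """Return `width` bits of `value` starting at bit position `shift`."""
    return (value >> shift) & ((1 << width) - 1)


def split_index(address, lower_shift, lower_width, upper_shift, upper_width):
    """Build the combined index from the lower and upper index portions,
    placing the upper portion above the lower one."""
    lower = extract_bits(address, lower_shift, lower_width)
    upper = extract_bits(address, upper_shift, upper_width)
    return (upper << lower_width) | lower


# Lower portion: bits 5-6 of the address; upper portion: bits 12-13.
index = split_index(0x1040, lower_shift=5, lower_width=2,
                    upper_shift=12, upper_width=2)
```

For address 0x1040 the lower portion is 2 and the upper portion is 1, giving a combined index of 6. Moving the upper portion to different address bits changes which addresses collide in the cache, which is the knob the abstract describes for reducing conflicts.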
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 13/00 - Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
G06F 12/0884 - Parallel mode, e.g. in parallel with main memory or CPU
G06F 12/0804 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
G06F 12/0842 - Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
G06F 12/0875 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
G06F 12/0895 - Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
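As a rough illustration of the split-index addressing described in the abstract above, the following Python sketch combines a lower and an upper index portion of a memory address into one cache set index. The parameters `lo_shift`, `lo_bits`, `hi_shift` and `hi_bits` stand in for the contents of the index configuration register; their names and the example values are assumptions made for illustration, not values from the application.

```python
def split_index(addr, lo_shift, lo_bits, hi_shift, hi_bits):
    """Combine two address bit fields into one cache set index.

    All four field parameters are hypothetical register settings.
    """
    lower = (addr >> lo_shift) & ((1 << lo_bits) - 1)
    upper = (addr >> hi_shift) & ((1 << hi_bits) - 1)
    # The upper portion supplies the high bits of the combined index.
    return (upper << lo_bits) | lower


# Two addresses exactly 64 KiB apart, e.g. the same offset in two buffers.
a = 0x10040
b = 0x20040

# Conventional contiguous index (bits 6..11 only): both map to the same set.
conv_a = split_index(a, 6, 6, 12, 0)
conv_b = split_index(b, 6, 6, 12, 0)

# Split index (bits 6..9 plus bits 16..17): the two map to different sets.
split_a = split_index(a, 6, 4, 16, 2)
split_b = split_index(b, 6, 4, 16, 2)

print(conv_a, conv_b, split_a, split_b)
```

With the contiguous index the two addresses collide in one set, while moving part of the index to higher address bits separates them, which is the kind of conflict reduction the abstract refers to.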
65.
METHODS AND APPARATUS TO OPERATE A MOBILE CAMERA FOR LOW-POWER USAGE
An example mobile camera includes a first convolutional neural network to recognize a first feature in first sensor data in response to the first feature being detected in the first sensor data; a state transitioner to transition the mobile camera from a first feature detection state to a second feature detection state in response to the first convolutional neural network recognizing the first feature, the mobile camera to operate using higher power consumption in the second feature detection state than in the first feature detection state; a second convolutional neural network to recognize a second feature in second sensor data in the second feature detection state; and a communications interface to send to an external device at least one of first metadata corresponding to the first feature or second metadata corresponding to the second feature.
An output of a first one of a plurality of layers within a neural network is identified. A bitmap is determined from the output, the bitmap including a binary matrix. A particular subset of operations for a second one of the plurality of layers is determined to be skipped based on the bitmap. Operations are performed for the second layer other than the particular subset of operations, while the particular subset of operations is skipped.
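A minimal Python sketch of this bitmap-driven skipping follows. It assumes, purely for illustration, that the first layer ends in a ReLU, that the bitmap marks non-zero activations, and that the second layer is a small dense layer; the abstract's binary matrix is simplified here to a one-dimensional list, and all function names are hypothetical.

```python
def relu(values):
    return [v if v > 0 else 0.0 for v in values]


def to_bitmap(values):
    """1 where the activation is non-zero, 0 otherwise."""
    return [1 if v != 0 else 0 for v in values]


def dense_with_skipping(activations, bitmap, weights):
    """Multiply weights by activations, skipping entries the bitmap marks 0.

    Returns the outputs and the number of multiply-accumulates performed.
    """
    outputs, macs = [], 0
    for row in weights:
        acc = 0.0
        for a, bit, w in zip(activations, bitmap, row):
            if bit:  # a skipped operation would have contributed exactly zero
                acc += a * w
                macs += 1
        outputs.append(acc)
    return outputs, macs


layer1_out = relu([-1.0, 2.0, 0.0, 3.0])
bitmap = to_bitmap(layer1_out)
weights = [[1.0, 1.0, 1.0, 1.0],
           [0.5, -1.0, 2.0, 1.0]]
outputs, macs = dense_with_skipping(layer1_out, bitmap, weights)
print(bitmap, outputs, macs)
```

Here half of the eight multiply-accumulates are skipped without changing the result, because the skipped operands are zero.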
A ray is cast into a volume described by a volumetric data structure, which describes the volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a lowest one of the plurality of levels of detail, and values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. A set of second entries in the volumetric data structure describes voxels at a second level of detail, which represent subvolumes of the voxels at the first, lowest level of detail. The ray is determined to pass through a particular subset of the voxels at the first level of detail and at least a particular one of the particular subset of voxels is determined to be occupied by geometry.
A volumetric data structure models a particular volume, representing the particular volume at a plurality of levels of detail. A first entry in the volumetric data structure includes a first set of bits representing voxels at a first level of detail, which is the lowest level of detail in the volumetric data structure. Values of the first set of bits indicate whether a corresponding one of the voxels is at least partially occupied by respective geometry. The volumetric data structure further includes a number of second entries representing voxels at a second level of detail higher than the first level of detail; the voxels at the second level of detail represent subvolumes of volumes represented by voxels at the first level of detail, and the number of second entries corresponds to the number of bits in the first set of bits with values indicating that a corresponding voxel volume is occupied.
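The relationship between the first entry and the second entries can be sketched in Python as follows. The sketch assumes each entry is a 64-bit integer covering a 4x4x4 block of voxels and that second entries are stored in ascending order of their parent bit; the entry width, the bit layout and the function names are illustrative assumptions, since the abstract does not specify them.

```python
def bit_index(x, y, z):
    """Map 4x4x4 voxel coordinates to a bit position (assumed layout)."""
    return x + 4 * y + 16 * z


def build(points):
    """Build a two-level structure from voxel coordinates in a 16x16x16 grid.

    Returns (first_entry, second_entries), with one second entry per set
    bit of first_entry, stored in ascending bit order.
    """
    children = {}
    for x, y, z in points:
        parent = bit_index(x // 4, y // 4, z // 4)
        child = bit_index(x % 4, y % 4, z % 4)
        children[parent] = children.get(parent, 0) | (1 << child)
    first = 0
    for parent in children:
        first |= 1 << parent
    second = [children[p] for p in sorted(children)]
    return first, second


def occupied(first, second, x, y, z):
    parent = bit_index(x // 4, y // 4, z // 4)
    if not (first >> parent) & 1:
        return False  # empty at the coarse level, so stop early
    # The rank of this bit among the set bits locates the child entry.
    rank = bin(first & ((1 << parent) - 1)).count("1")
    child = bit_index(x % 4, y % 4, z % 4)
    return bool((second[rank] >> child) & 1)


first, second = build([(0, 0, 0), (5, 1, 0), (15, 15, 15)])
print(len(second), occupied(first, second, 5, 1, 0))
```

Only occupied coarse voxels get a second entry, so empty space costs a single zero bit, and a ray traversal can reject an empty coarse voxel without reading any finer data.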
The present application relates generally to a parallel processing device. The parallel processing device can include a plurality of processing elements, a memory subsystem, and an interconnect system. The memory subsystem can include a plurality of memory slices, at least one of which is associated with one of the plurality of processing elements and comprises a plurality of random access memory (RAM) tiles, each tile having individual read and write ports. The interconnect system is configured to couple the plurality of processing elements and the memory subsystem. The interconnect system includes a local interconnect and a global interconnect.
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
G09G 5/397 - Arrangements specially adapted for transferring the contents of two or more bit-mapped memories to the screen simultaneously, e.g. for mixing or overlay
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding the video frame in parallel.
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 13/00 - Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
G06F 12/0884 - Parallel mode, e.g. in parallel with main memory or CPU
G06F 12/0804 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
G06F 12/0842 - Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
G06F 12/0875 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
G06F 12/0895 - Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
71.
Apparatus, systems, and methods for low power computational imaging
The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
The present application provides a method of randomly accessing a compressed structure in memory without the need for retrieving and decompressing the entire compressed structure.
The present application provides a method of corner detection and an image processing system for detecting corners in an image. The preferred implementation is in software using enabling and reusable hardware features in the underlying vector processor architecture. The advantage of this combined software and programmable processor datapath hardware is that the same hardware used for the FAST algorithm can also be readily applied to a variety of other computational tasks, not limited to image processing.
A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
G06F 1/3203 - Power management, i.e. event-based initiation of a power-saving mode
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
Systems and methods are provided for rendering of a dual eye-specific display. The system tracks the user's eye movements and/or positions, in some implementations, based on electroencephalography (EEG) of the user, to correctly label the central (foveal) and peripheral (extra-foveal) areas of the display. Foveal data is fully rendered while extra-foveal data is reduced in resolution and, in some implementations, shared between the two displays.
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G09G 5/00 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
H04N 19/597 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/17 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/426 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements using memory downsizing methods
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/383 - Image reproducers using viewer tracking for tracking with gaze detection, i.e. detecting the lines of sight of the viewer's eyes
Systems and methods are provided for image classification using histograms of oriented gradients (HoG) in conjunction with a trainer. The efficiency of the process is greatly increased by first establishing a bitmap which identifies a subset of the pixels in the HoG window as including relevant foreground information, and limiting the HoG calculation and comparison process to only the pixels included in the bitmap.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 - Extraction of features or characteristics of the image
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
77.
Systems and methods for providing an image classifier
Systems and methods are provided for image classification using histograms of oriented gradients (HoG) in conjunction with a trainer. The efficiency of the process is greatly increased by first establishing a bitmap which identifies a subset of the pixels in the HoG window as including relevant foreground information, and limiting the HoG calculation and comparison process to only the pixels included in the bitmap.
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06K 9/66 - Methods or arrangements for recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references, e.g. resistor matrix references adjustable by an adaptive method, e.g. learning
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
78.
Systems, methods, and apparatuses for histogram of gradients
One of the challenges in bringing computational imaging to a mass market is that computational imaging is inherently computationally expensive. The computational challenges associated with computational imaging are apparent with the computation of a histogram of gradient descriptors. Oftentimes, generating a histogram of gradient descriptors involves computing gradients of an image, binning the gradients according to their orientation, and, optionally, normalizing the bins using a non-linear function. Because each of these operations is expensive, the histogram of gradient descriptor computations is generally computationally expensive and is difficult to implement in a power efficient manner for mobile applications. The present application discloses a computing device that can provide a low-power, highly capable computing platform for computing a histogram of gradient descriptors.
Cache memory mapping techniques are presented. A cache may contain an index configuration register. The register may configure the locations of an upper index portion and a lower index portion of a memory address. The portions may be combined to create a combined index. The configurable split-index addressing structure may be used, among other applications, to reduce the rate of cache conflicts occurring between multiple processors decoding the video frame in parallel.
G06F 12/0884 - Parallel mode, e.g. in parallel with main memory or CPU
G06F 12/0804 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
G06F 12/0842 - Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
G06F 12/0875 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
G06F 12/0895 - Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
One of the challenges in bringing computational imaging to a mass market is that computational imaging is inherently computationally expensive. The computational challenges associated with computational imaging are apparent with the computation of a histogram of gradient descriptors. Oftentimes, generating a histogram of gradient descriptors involves computing gradients of an image, binning the gradients according to their orientation, and, optionally, normalizing the bins using a non-linear function. Because each of these operations is expensive, the histogram of gradient descriptor computations is generally computationally expensive and is difficult to implement in a power efficient manner for mobile applications. The present application discloses a computing device that can provide a low-power, highly capable computing platform for computing a histogram of gradient descriptors.
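The three steps named in the abstract (computing gradients, binning them by orientation, normalizing the bins) can be sketched in plain Python for a single cell. This is a software illustration only, not the disclosed hardware; the central-difference gradient, the nine unsigned orientation bins and the L2 normalization are common choices assumed here, and `hog_cell` is a hypothetical name.

```python
import math


def hog_cell(image, bins=9):
    """Histogram of gradient orientations for one small grayscale cell."""
    h, w = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Step 1: gradients by central differences.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            mag = math.hypot(gx, gy)
            # Step 2: bin by unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            b = min(int(angle / (180.0 / bins)), bins - 1)
            hist[b] += mag
    # Step 3: non-linear (L2) normalization of the bins.
    norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
    return [v / norm for v in hist]


# A horizontal intensity ramp: every gradient points along the x axis.
ramp = [[float(x) for x in range(4)] for _ in range(4)]
hist = hog_cell(ramp)
print(hist)
```

Each pixel needs a square root, an arctangent and a division, which gives a sense of why the abstract calls the computation expensive for mobile devices.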
The disclosed subject matter includes an apparatus configured to remove a shading effect from an image. The apparatus can include one or more interfaces configured to provide communication with an imaging module that is configured to capture the image, and a processor, in communication with the one or more interfaces, configured to run a module stored in memory. The module is configured to receive the image captured by the imaging module under a first lighting spectrum, receive a per-unit correction mesh for adjusting images captured by the imaging module under a second lighting spectrum, determine a correction mesh for the image captured under the first lighting spectrum based on the per-unit correction mesh for the second lighting spectrum, and operate the correction mesh on the image to remove the shading effect from the image.
The disclosed embodiments include an apparatus implemented in a semiconductor integrated chip. The apparatus is configured to operate a composite function, comprising a first function and a second function, on a first patch of an image. The apparatus includes a first function operator configured to operate the first function on the group of pixel values to provide a first processed group of pixel values. The apparatus also includes a delay system configured to maintain the first processed group of pixel values for a predetermined period of time to provide a delayed processed group of pixel values. The apparatus further includes a second function operator configured to operate a second function on at least a second processed group of pixels and the delayed processed group to determine an output of the composite function.
A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
84.
Apparatus, systems, and methods for low power computational imaging
The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G06F 13/28 - Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access, cycle steal
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
85.
Apparatus, systems, and methods for providing computational imaging pipeline
The present application relates generally to a parallel processing device. The parallel processing device can include a plurality of processing elements, a memory subsystem, and an interconnect system. The memory subsystem can include a plurality of memory slices, at least one of which is associated with one of the plurality of processing elements and comprises a plurality of random access memory (RAM) tiles, each tile having individual read and write ports. The interconnect system is configured to couple the plurality of processing elements and the memory subsystem. The interconnect system includes a local interconnect and a global interconnect.
G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
G09G 5/36 - Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of individual graphic patterns using a bit-mapped memory
G09G 5/397 - Arrangements specially adapted for transferring the contents of two or more bit-mapped memories to the screen simultaneously, e.g. for mixing or overlay
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
The present application relates generally to a parallel processing device. The parallel processing device can include a plurality of processing elements, a memory subsystem, and an interconnect system. The memory subsystem can include a plurality of memory slices, at least one of which is associated with one of the plurality of processing elements and comprises a plurality of random access memory (RAM) tiles, each tile having individual read and write ports. The interconnect system is configured to couple the plurality of processing elements and the memory subsystem. The interconnect system includes a local interconnect and a global interconnect.
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
G06F 15/167 - Interprocessor communication using a common memory, e.g. mailbox
G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.
G06T 1/20 - Processor architectures; Processor configuration, e.g. pipelining
G06F 15/80 - Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
88.
Broadcast video decoder with reduced memory and processing requirements suitable for handheld and mobile applications
The present application relates to an apparatus for programmable video size reduction with dynamic image filtering for use in a block-based video decoding system. The invention improves the image quality within low video memory requirements and allows for efficient decoding of higher resolution video to be displayed on a lower resolution display device.
H04N 19/51 - Motion estimation or motion compensation
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/182 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
H04N 19/44 - Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
H04N 19/86 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
H04N 19/59 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
Hardware for performing sequences of arithmetic operations. The hardware comprises a scheduler operable to generate a schedule of instructions from a bitmap denoting whether an entry in a matrix is zero or not. An arithmetic circuit is provided which is configured to perform arithmetic operations on the matrix in accordance with the schedule.
The present application provides a method of randomly accessing a compressed structure in memory without the need for retrieving and decompressing the entire compressed structure.
The present application relates to the field of processors and in particular to the carrying out of arithmetic operations. Many of the computations performed by processors consist of a large number of simple operations. As a result, a multiplication operation may take a significant number of clock cycles to complete. The present application provides a processor having a trivial operand register, which is used in the carrying out of arithmetic or storage operations for data values stored in a data store.
The present application addresses a fundamental problem in the design of computing systems, that of minimizing the cost of memory access. This is a fundamental limitation on the design of computer systems because, regardless of the memory technology or manner of connection to the processor, there is a maximum limitation on how much data can be transferred between processor and memory in a given time. This is the available memory bandwidth, and the limitation of compute power by available memory bandwidth is often referred to as the memory wall. The solution provided creates a map of a data structure to be compressed, the map representing the locations of non-trivial data values in the structure (e.g. non-zero values), and deletes the trivial data values from the structure to provide a compressed structure.
G06F 7/00 - Methods or arrangements for processing data by operating upon the order or content of the data handled
G06F 7/48 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using unspecified devices
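The map-based compression described above, together with random access into the compressed structure, can be sketched in Python. The sketch assumes that zero is the only trivial value and that the map is a bitmap held in one integer; the function names `compress` and `read` are hypothetical.

```python
def compress(values):
    """Return (bitmap, packed): bit i of bitmap is set when values[i] != 0."""
    bitmap, packed = 0, []
    for i, v in enumerate(values):
        if v != 0:
            bitmap |= 1 << i
            packed.append(v)
    return bitmap, packed


def read(bitmap, packed, i):
    """Read element i without decompressing the whole structure."""
    if not (bitmap >> i) & 1:
        return 0  # trivial value, never stored
    # Count the non-trivial values before position i to find the offset.
    offset = bin(bitmap & ((1 << i) - 1)).count("1")
    return packed[offset]


data = [0, 7, 0, 0, 3, 0, 9, 0]
bitmap, packed = compress(data)
print(bitmap, packed, read(bitmap, packed, 4))
```

Only three of the eight values are stored, and any element can still be located with one mask and one population count on the map.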