A permission-based media system to perform operations that include: presenting a first media object at a client device associated with a user account, the first media object including a reference that identifies the user account; receiving an input that selects the first media object from the client device; determining a permission of the user account based on the reference that identifies the user account; presenting a set of options based on the permission associated with the user account; receiving a selection of an option from among the set of options; and generating a second media object based on the first media object and the selection of the option, according to certain embodiments.
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
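As an illustration only, the following minimal Python sketch shows one way the permission lookup and option gating described in the abstract above could be structured; the permission names, option lists, and function names are hypothetical and are not taken from the filing.

    # Hypothetical sketch of permission-gated media options (not from the filing).
    OPTIONS_BY_PERMISSION = {
        "owner":  ["remix", "share", "save"],   # user referenced by the media object
        "viewer": ["share"],                    # any other authenticated user
    }

    def permission_for(media_object: dict, user_account: str) -> str:
        # The first media object carries a reference identifying a user account.
        return "owner" if media_object.get("account_ref") == user_account else "viewer"

    def options_for(media_object: dict, user_account: str) -> list[str]:
        return OPTIONS_BY_PERMISSION[permission_for(media_object, user_account)]

    def generate_second_media_object(first: dict, option: str) -> dict:
        # A second media object derived from the first and the selected option.
        return {"derived_from": first["id"], "applied_option": option}

    first = {"id": "m1", "account_ref": "alice"}
    assert options_for(first, "alice") == ["remix", "share", "save"]
    second = generate_second_media_object(first, "remix")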
2.
IMAGE GENERATION USING SURFACE-BASED NEURAL SYNTHESIS
Aspects of the present disclosure involve a system and a method for performing operations comprising: receiving a two-dimensional continuous surface representation of a three-dimensional object, the continuous surface comprising a plurality of landmark locations; determining a first set of soft membership functions based on a relative location of points in the two-dimensional continuous surface representation and the landmark locations; receiving a two-dimensional input image, the input image comprising an image of the object; extracting a plurality of features from the input image using a feature recognition model; generating an encoded feature representation of the extracted features using the first set of soft membership functions; generating a dense feature representation of the extracted features from the encoded representation using a second set of soft membership functions; and processing the second set of soft membership functions and dense feature representation using a neural image decoder model to generate an output image.
G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
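The abstract above does not specify a functional form for the soft membership functions; the sketch below assumes, purely for illustration, a softmax over negative squared distances between surface points and landmark locations. Per-point features could then be pooled with these weights to form the encoded feature representation.

    import numpy as np

    def soft_memberships(points: np.ndarray, landmarks: np.ndarray, temperature: float = 1.0) -> np.ndarray:
        """Return an (N, K) matrix of soft memberships of N surface points to K landmarks.

        Assumed form: softmax over negative squared distances (illustrative only).
        """
        d2 = ((points[:, None, :] - landmarks[None, :, :]) ** 2).sum(-1)  # (N, K)
        logits = -d2 / temperature
        logits -= logits.max(axis=1, keepdims=True)                       # numerical stability
        w = np.exp(logits)
        return w / w.sum(axis=1, keepdims=True)

    points = np.random.rand(100, 2)      # 2D continuous surface parameterisation
    landmarks = np.random.rand(8, 2)     # landmark locations on the same surface
    m = soft_memberships(points, landmarks)
    assert np.allclose(m.sum(axis=1), 1.0)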
Methods and systems are disclosed for using generative machine learning models to generate fashion items for avatars. The methods and systems present a graphical user interface (GUI) comprising icons representing different types of avatar fashion items. The methods and systems receive input that selects an individual icon corresponding to an individual avatar fashion item. The methods and systems, in response to receiving the input, receive a selection of an individual texture from a texture selection region, the texture selection region comprising a first set of predefined textures and an option associated with a second set of textures generated based on a prompt by a generative machine learning model. The methods and systems apply the individual avatar fashion item with the individual texture to an avatar depicted in the GUI.
G06T 11/60 - Editing figures and text; Combining figures or text
A63F 13/53 - Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game
A63F 13/795 - Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories for finding other players; Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories for building a team; Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories for providing a buddy list
A63F 13/87 - Communicating with other players during game play, e.g. by e-mail or chat
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
A method of dissipating heat generated by imaging devices and processing devices of a wearable electronic eyewear device includes providing a first heat sink thermally connecting the imaging devices to a frame of the eyewear device to sink heat to the frame and providing a second heat sink thermally connecting the processing devices to respective temples of the eyewear device to sink heat to the respective temples. The first and second heat sinks are thermally insulated from each other to direct the heat to different portions of the eyewear device. The processing devices may include a first co-processor disposed in a first temple connected to a first end of the frame and a second co-processor disposed in a second temple connected to a second end of the frame. The resulting eyewear device spreads the heat from heat generating devices over a larger area to minimize overall heating.
A personalized preview system to receive a request to access a collection of media items from a user of a user device. Responsive to receiving the request to access the collection of media items, the personalized preview system accesses user profile data associated with the user, wherein the user profile data includes an image. For example, the image may comprise a depiction of a face, wherein the face comprises a set of facial landmarks. Based on the image, the personalized preview system generates one or more media previews based on corresponding media templates and the image, and displays the one or more media previews within a presentation of the collection of media items at a client device of the user.
Systems and methods are provided for generating an augmented reality (AR) experience. The systems and methods receive a video depicting movement of a humanoid and a target image depicting an object. The systems and methods process, by a generative machine learning (ML) model, the video and the target image to generate a new video depicting the object performing the movement. The systems and methods generate the AR experience using the new video to overlay a face of a user on a portion of the new video.
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for providing audio with captured video clips. The program and method provide for displaying, by a messaging application, a capture user interface for capturing video; providing a camera mode selection element which is selectable to switch between a first camera mode for capturing a single video clip and a second camera mode for capturing multiple video clips, to generate a media content item; providing an audio selection element which is selectable to select an audio track for the media content item; receiving, via the camera mode selection element, first user input selecting the second camera mode; receiving, via the audio selection element, second user input selecting the audio track; and providing for capturing multiple video clips in association with the selected audio track for generating the media content item.
A head-wearable extended reality (XR) device includes an optical assembly. The optical assembly has a display and an optical element. The display is provided to display virtual content to a user of the XR device. The optical element is provided to direct the virtual content from the display along an optical path to an eye of the user. The optical element includes a first portion and a second portion. The first portion provides a first focus distance that corresponds to a first viewing zone of the display. The second portion provides a second focus distance that differs from the first focus distance and corresponds to a second viewing zone of the display.
An occlusion detection system to perform operations that include: capturing image data that depicts an environment at a client device, the environment including a target object at a position within the environment; causing display of a presentation of the environment at the client device, the presentation of the environment including a display of the target object at the position within the environment; detecting a first attribute of the display of the target object at the client device; performing a comparison of the first attribute of the display of the target object and a second attribute associated with the target object; and detecting an occlusion based on the comparison.
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06T 7/90 - Determination of colour characteristics
G06T 19/00 - Manipulating 3D models or images for computer graphics
G06V 10/40 - Extraction of image or video features
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
H04L 51/222 - Monitoring or handling of messages using geographical location information, e.g. messages transmitted or received in proximity of a certain spot or area
H04W 4/021 - Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
H04W 4/30 - Services specially adapted for particular environments, situations or purposes
The media preview system receives media content from one or more client devices, generates a preview of the media content, associates a coded image with the preview within a database associated with the media preview system, detects scans of the coded image from client devices, and causes display of the preview at the client devices in response to detecting a scan.
An antenna that is coupled to and integrated with a projector, such as a projector included in smart glasses or other eyewear. The projector has a housing and includes optical components configured to display an image. At least one antenna is coupled to the projector, wherein the optical components operate and function as an antenna substrate. The optical components are nonmetallic such that the antenna generates a strong E-field. The antenna may be coupled to the projector housing, such as on the inside or the outside surface of the housing. Multiple antennas can be included to generate multiple resonances simultaneously in different frequency bands.
Disclosed are systems, methods, and non-transitory computer-readable media for continuous surface and depth estimation. A continuous surface and depth estimation system determines the depth and surface normal of physical objects by using stereo vision limited within a predetermined window.
Embodiments described herein include an expressive icon system to present an animated graphical icon, wherein the animated graphical icon is generated by capturing facial tracking data at a client device. In some embodiments, the system may track and capture facial tracking data of a user via a camera associated with a client device (e.g., a front facing camera, or a paired camera), and process the facial tracking data to animate a graphical icon.
G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
H04M 1/72427 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for supporting games or graphical animations
H04M 1/7243 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
H04M 1/72469 - User interfaces specially adapted for cordless or mobile telephones for operating the device by selecting functions from two or more displayed items, e.g. menus or icons
A multimodal video generation framework (MMVID) that benefits from text and images provided jointly or separately as input. Quantized representations of videos are utilized with a bidirectional transformer with multiple modalities as inputs to predict a discrete video representation. A new video token trained with self-learning and an improved mask-prediction algorithm for sampling video tokens is used to improve video quality and consistency. Text augmentation is utilized to improve the robustness of the textual representation and diversity of generated videos. The framework incorporates various visual modalities, such as segmentation masks, drawings, and partially occluded images. In addition, the MMVID extracts visual information as suggested by a textual prompt.
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
A system to provide users with a means for accessing media content directly, by performing operations that include: causing display of a media item within a graphical user interface at a client device, the graphical user interface including a set of graphical elements; receiving a selection of a graphical element from among the set of graphical elements within the graphical user interface; generating a reference to the media item based on the selection of the graphical element; encoding a matrix barcode with the reference to the media item; and generating a presentation of the media item that includes a display of the matrix barcode at a position within the media item.
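A rough sketch of the barcode step in the abstract above, using the third-party qrcode and Pillow packages; the reference format, file names, and overlay position are illustrative assumptions rather than details from the abstract.

    # Illustrative only: encode a media reference into a QR code and overlay it.
    import qrcode
    from PIL import Image

    def add_reference_barcode(media_path: str, reference: str, out_path: str) -> None:
        base = Image.open(media_path).convert("RGB")
        qrcode.make(reference).save("ref_qr.png")               # matrix barcode for the reference
        qr = Image.open("ref_qr.png").resize((120, 120))
        base.paste(qr, (base.width - 130, base.height - 130))   # bottom-right position (assumed)
        base.save(out_path)

    # add_reference_barcode("item.png", "https://example.com/media/abc123", "item_with_qr.png")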
A head-wearable extended reality (XR) device includes a display arrangement. The display arrangement has a display to display virtual content, and also has one or more optical elements to direct the virtual content along an optical path to an eye of a user of the XR device. The virtual content is presented in a virtual content field of view. The display arrangement further includes an adjustment mechanism to alter the optical path so as to adjust the virtual content field of view between at least two display modes.
A button-switch assembly provides a preloaded force design with an enhanced tactile feel while also providing a non-wobbly (stabilized) configuration and water/dust protection functions. Features of the button-switch assembly include excellent tactile feel through a stack up of a soft rubber layer of a deflection web and a hard PET film shim layer, a consistent pre-loaded push force through use of an angled deflection web, a button flange that minimizes rotation of the button while providing a consistent tactile feel even when the edge of the button is depressed, double sided sealing adhesive layers that seal off the opening in the housing for accepting the button to prevent water/dust from entering the opening, and gluing the button to the rubber deflection web in variable thicknesses to provide a stable tension force to minimize wobble of the button when depressed.
H01H 13/06 - Dustproof, splashproof, drip-proof, waterproof, or flameproof casings
H01H 13/705 - Switches having rectilinearly-movable operating part or parts adapted for pushing or pulling in one direction only, e.g. push-button switch having a plurality of operating members associated with different sets of contacts, e.g. keyboard with contacts carried by or formed from layers in a multilayer structure, e.g. membrane switches characterised by construction, mounting or arrangement of operating parts, e.g. push-buttons or keys
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and a method for performing operations comprising: accessing, by a first application implemented on a client device, data collected from one or more entropy sources; causing a second application implemented on the client device to access the data collected from the one or more entropy sources; generating a shared cryptographic key using the data collected from one or more entropy sources; establishing a communication channel between the first application and the second application; and exchanging, over the communication channel between the first application and the second application, one or more messages that have been encrypted using the shared cryptographic key.
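A minimal sketch of the flow above, assuming the shared entropy is byte data readable by both applications and using SHA-256 plus Fernet (from the cryptography package) as stand-ins for the unspecified key-derivation and encryption schemes.

    import base64, hashlib
    from cryptography.fernet import Fernet

    def derive_shared_key(entropy: bytes) -> bytes:
        # Both applications read the same entropy-source data, so both derive the same key.
        return base64.urlsafe_b64encode(hashlib.sha256(entropy).digest())

    entropy = b"sensor-noise-and-other-shared-entropy"   # placeholder for collected entropy data
    app_a = Fernet(derive_shared_key(entropy))
    app_b = Fernet(derive_shared_key(entropy))

    token = app_a.encrypt(b"hello over the shared channel")   # message exchanged between the apps
    assert app_b.decrypt(token) == b"hello over the shared channel"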
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Peripherals; Augmented reality glasses; Augmented reality headsets; Computer hardware with embedded operating system software; Computer hardware, peripherals and software for remotely accessing, capturing, transmitting and displaying pictures, video, audio and data; Downloadable software for setting up, configuring, and controlling wearable computer hardware and peripherals; Downloadable software for setting up, configuring, and controlling wearable computer hardware and peripheral devices in the field of augmented reality; Downloadable computer operating software for augmented reality; Downloadable mobile operating system software; Downloadable computer operating system software; Downloadable computer operating system for operating augmented reality devices; Downloadable computer software offering backend components for augmented reality software development including data storage and database integration, user authentication services, real-time capabilities, application programming interfaces (APIs), vector embeddings, and backward compatibility
Providing temporary use of online non-downloadable middleware for providing an interface between augmented reality devices and operating systems; Providing temporary use of online non-downloadable software for providing an interface between augmented reality devices and operating systems; Providing temporary use of online non-downloadable software for providing an interface between computer peripheral devices and operating systems; Platform as a service (PAAS) featuring computer software offering backend components for software developers; Backend as a service (BAAS) services featuring a software platform that offers backend components, including data storage and database integration, user authentication services, real-time capabilities, application programming interfaces (APIs), vector embeddings, and backward compatibility
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Computer peripherals; Augmented reality glasses; Augmented reality headsets; Computer hardware, peripherals and recorded software for remotely accessing, capturing, transmitting and displaying pictures, video, audio and data; Software for setting up, configuring, and controlling wearable computer hardware and peripherals; Software for setting up, configuring, and controlling wearable computer hardware and peripheral devices in the field of augmented reality; Downloadable computer operating software for augmented reality; Downloadable mobile operating system software; Downloadable computer operating system software; Downloadable computer operating system for operating augmented reality devices; Downloadable communications software for connecting computer network users; Downloadable computer software for organizing and viewing digital images and photographs; Downloadable computer operating system software for virtual environments; Downloadable operating system programs; Recorded operating system programs; Recorded computer operating system software; Computer hardware with embedded operating system software; Downloadable computer software for use as an application programming interface (API); Downloadable computer software for creating digital animation and special effects of images; Downloadable communication software for providing access to the Internet; Downloadable computer search engine software; Downloadable software for browsing the internet and accessing search engines; Downloadable computer software for geographic mapping, location mapping, spatial mapping, and spatial computing
Providing temporary use of online non-downloadable middleware for providing an interface between augmented reality devices and operating systems; Providing temporary use of online non-downloadable software for providing an interface between augmented reality devices and operating systems; Providing temporary use of online non-downloadable software for providing an interface between computer peripheral devices and operating systems; Non-downloadable software for setting up, configuring, and controlling wearable computer hardware and peripherals; Non-downloadable software for setting up, configuring, and controlling wearable computer hardware and peripheral devices in the field of augmented reality; Non-downloadable computer operating software for augmented reality; Non-downloadable mobile operating system software; Non-downloadable computer operating system software; Non-downloadable computer operating system for operating augmented reality devices; non-downloadable communications software for connecting computer network users; Non-downloadable computer software for organizing and viewing digital images and photographs; Non-downloadable computer operating system software for virtual environments; Non-downloadable operating system programs; Non-downloadable computer software for use as an application programming interface (API); Non-downloadable computer software for creating digital animation and special effects of images; Non-downloadable communication software for providing access to the Internet; Non-downloadable computer search engine software; Providing temporary use of non-downloadable cloud-based software for connecting, operating, and managing networked wearable computer peripherals in the internet of things (IoT); Design, maintenance, development and updating of computer software; Development, maintenance and updating of a telecommunication network search engine; Software as a service (SAAS) services featuring software for browsing the internet and accessing search engines; Provision of Internet search engines; Providing a website featuring a search engine for accessing online content and content sharing; Providing temporary use of on-line non-downloadable software for geographic mapping, location mapping, spatial mapping, and spatial computing
(1) Computer peripherals; Augmented reality glasses; Augmented reality headsets; Computer hardware, peripherals and recorded software for remotely accessing, capturing, transmitting and displaying pictures, video, audio and data; Software for setting up, configuring, and controlling wearable computer hardware and peripherals; Software for setting up, configuring, and controlling wearable computer hardware and peripheral devices in the field of augmented reality; Downloadable computer operating software for augmented reality; Downloadable mobile operating system software; Downloadable computer operating system software; Downloadable computer operating system for operating augmented reality devices; Downloadable communications software for connecting computer network users; Downloadable computer software for organizing and viewing digital images and photographs; Downloadable computer operating system software for virtual environments; Downloadable operating system programs; Recorded operating system programs; Recorded computer operating system software; Computer hardware with embedded operating system software; Downloadable computer software for use as an application programming interface (API); Downloadable computer software for creating digital animation and special effects of images; Downloadable communication software for providing access to the Internet; Downloadable computer search engine software; Downloadable software for browsing the internet and accessing search engines; Downloadable computer software for geographic mapping, location mapping, spatial mapping, and spatial computing
(1) Providing temporary use of online non-downloadable middleware for providing an interface between augmented reality devices and operating systems; Providing temporary use of online non-downloadable software for providing an interface between augmented reality devices and operating systems; Providing temporary use of online non-downloadable software for providing an interface between computer peripheral devices and operating systems; Non-downloadable software for setting up, configuring, and controlling wearable computer hardware and peripherals; Non-downloadable software for setting up, configuring, and controlling wearable computer hardware and peripheral devices in the field of augmented reality; Non-downloadable computer operating software for augmented reality; Non-downloadable mobile operating system software; Non-downloadable computer operating system software; Non-downloadable computer operating system for operating augmented reality devices; non-downloadable communications software for connecting computer network users; Non-downloadable computer software for organizing and viewing digital images and photographs; Non-downloadable computer operating system software for virtual environments; Non-downloadable operating system programs; Non-downloadable computer software for use as an application programming interface (API); Non-downloadable computer software for creating digital animation and special effects of images; Non-downloadable communication software for providing access to the Internet; Non-downloadable computer search engine software; Providing temporary use of non-downloadable cloud-based software for connecting, operating, and managing networked wearable computer peripherals in the internet of things (IoT); Design, maintenance, development and updating of computer software; Development, maintenance and updating of a telecommunication network search engine; Software as a service (SAAS) services featuring software for browsing the internet and accessing search engines; Provision of Internet search engines; Providing a website featuring a search engine for accessing online content and content sharing; Providing temporary use of on-line non-downloadable software for geographic mapping, location mapping, spatial mapping, and spatial computing
09 - Scientific and electric apparatus and instruments
42 - Scientific, technological and industrial services, research and design
Goods & Services
Downloadable software; downloadable mobile applications; computer programs and downloadable computer software using artificial intelligence for natural language processing, generation, understanding and analysis; downloadable computer programs and downloadable computer software for machine learning; downloadable computer programs and downloadable computer software for image recognition and generation; downloadable computer programs and downloadable computer software using artificial intelligence for music generation and suggestions; downloadable computer programs and downloadable computer software for artificial intelligence, namely, computer software for developing, running and analyzing algorithms that are able to learn to analyze, classify, and take actions in response to exposure to data; downloadable computer software for machine-learning based language and speech processing; downloadable software using artificial intelligence for image recognition and generation; downloadable software using artificial intelligence for text recognition and generation; downloadable computer software using artificial intelligence for image and video generation, editing and retouching; downloadable computer software using artificial intelligence (AI) for the generation of text, images, photos, videos, audio, and multimedia content; downloadable computer software using artificial intelligence (AI) for connecting consumers with targeted promotional advertisements; downloadable computer software using artificial intelligence (AI) for the generation of advertisements and promotional materials; downloadable computer software using artificial intelligence (AI) for creating and generating text.
Providing online non-downloadable software; research and development services; research and development services in the field of artificial intelligence; providing on-line non-downloadable software using artificial intelligence (AI) for natural language processing, generation, understanding, and analysis; providing on-line non-downloadable software for machine learning; providing on-line non-downloadable software for image recognition and generation; providing on-line non-downloadable software for developing, running and analyzing algorithms that are able to learn to analyze, classify, and take actions in response to exposure to data; software as a service (SaaS) services featuring software for using language models; providing on-line non-downloadable software for machine-learning based language and speech processing; providing on-line non-downloadable software using artificial intelligence (AI) for image recognition and generation; providing on-line non-downloadable software using artificial intelligence (AI) for text recognition and generation; providing on-line non-downloadable software for the generation of advertisements and promotional materials; providing on-line non-downloadable software using artificial intelligence (AI) for music generation and suggestions; providing on-line non-downloadable software using artificial intelligence (AI) for image and video generation, editing and retouching; providing on-line non-downloadable software using artificial intelligence (AI) for the generation of text, images, photos, videos, audio, and multimedia content; providing on-line non-downloadable software using artificial intelligence (AI) for connecting consumers with promotional advertisements; providing on-line non-downloadable software using artificial intelligence (AI) for the generation of advertisements and promotional materials; providing on-line non-downloadable software using artificial intelligence (AI) for creating and generating text.
23.
SINGLE IMAGE THREE-DIMENSIONAL HAIR RECONSTRUCTION
A system enables 3D hair reconstruction and rendering from a single reference image by performing a multi-stage process that utilizes both a 3D implicit representation and a 2D parametric embedding space.
A candidate content item is identified for integration into a content collection. The candidate content item is associated with a first value. Using at least one machine learning model, a select value and a skip value are automatically generated for the candidate content item. The select value indicates a likelihood that a user will select the candidate content item, and the skip value indicates a likelihood that the user will bypass the candidate content item. A second value is generated for the candidate content item based on the first value, the select value, and the skip value. The candidate content item is automatically selected from a plurality of candidate content items based on the second value meeting at least one predetermined criterion. The selected candidate content item is then automatically integrated into the content collection, which is caused to be presented on a device of the user.
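The abstract above leaves the combination of the first value, select value, and skip value unspecified; the scoring rule below is a hypothetical example of generating the second value and choosing among candidates, with an invented threshold as the predetermined criterion.

    # Hypothetical scoring: the exact combination rule is not given in the abstract.
    def second_value(first_value: float, p_select: float, p_skip: float) -> float:
        return first_value * p_select * (1.0 - p_skip)

    candidates = [
        {"id": "a", "first_value": 0.9, "p_select": 0.30, "p_skip": 0.50},
        {"id": "b", "first_value": 0.6, "p_select": 0.70, "p_skip": 0.10},
    ]
    for c in candidates:
        c["second_value"] = second_value(c["first_value"], c["p_select"], c["p_skip"])

    THRESHOLD = 0.25  # example predetermined criterion
    selected = max((c for c in candidates if c["second_value"] >= THRESHOLD),
                   key=lambda c: c["second_value"])
    assert selected["id"] == "b"   # this candidate would be integrated into the collection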
Eyewear devices that include two SoCs that share processing workload. Instead of using a single SoC located either on the left or right side of the eyewear devices, the two SoCs have different assigned responsibilities to operate different devices and perform different processes to balance workload. In one example, the eyewear device utilizes a first SoC to operate displays and to perform three-dimensional graphics and compositing. A second SoC operates first and second color cameras, first and second computer vision (CV) cameras, an operating system (OS), CV algorithms, and visual odometry (VIO), performs hand gesture tracking of the user, and provides depth from stereo images. This configuration provides organized logistics to efficiently operate various features, and balanced power consumption.
A head-wearable extended reality (XR) device includes an optical assembly. The optical assembly has a display and an optical element. The display is provided to display virtual content to a user of the XR device. The optical element is provided to direct the virtual content from the display along an optical path to an eye of the user. The optical element includes a first portion and a second portion. The first portion provides a first focus distance that corresponds to a first viewing zone of the display. The second portion provides a second focus distance that differs from the first focus distance and corresponds to a second viewing zone of the display.
A head-wearable extended reality (XR) device includes a display arrangement. The display arrangement has a display to display virtual content, and also has one or more optical elements to direct the virtual content along an optical path to an eye of a user of the XR device. The virtual content is presented in a virtual content field of view. The display arrangement further includes an adjustment mechanism to alter the optical path so as to adjust the virtual content field of view between at least two display modes.
H04N 13/344 - Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays
H04N 13/361 - Reproducing mixed stereoscopic images; Reproducing mixed monoscopic and stereoscopic images, e.g. a stereoscopic image overlay window on a monoscopic image background
H04N 13/383 - Image reproducers using viewer tracking for tracking with gaze detection, i.e. detecting the lines of sight of the viewer's eyes
A three-dimensional (3D) scene is generated from non-aligned generic camera priors by producing a tri-plane representation for an input scene received in random latent code, obtaining a camera posterior including posterior parameters representing color and density data from the random latent code and from generic camera priors without alignment assumptions, and volumetrically rendering an image of the input scene from the color and density data to provide a scene having pixel colors and depth values from an arbitrary camera viewpoint. A depth adaptor processes depth values to generate an adapted depth map that bridges domains of rendered and estimated depth maps for the image of the input scene. The adapted depth map, color data, and scene geometry information from an external dataset are provided to a discriminator for selection of a 3D representation of the input scene.
The subject technology receives a set of frames. The subject technology detects a first gesture corresponding to an open trigger finger gesture. The subject technology receives a second set of frames. The subject technology detects, from the second set of frames, a second gesture corresponding to a closed trigger finger gesture. The subject technology detects a location and a position of a representation of a finger from the closed trigger finger gesture. The subject technology generates a first virtual object based at least in part on the location and the position of the representation of the finger. The subject technology renders a movement of the first virtual object along a vector away from the location and the position of the representation of the finger within a first scene. The subject technology provides for display the rendered movement of the first virtual object along the vector within the first scene.
Various embodiments provide systems, methods, devices, and instructions for protected data use in a third-party software application, where use can be enabled while maintaining protection of the protected data from the third-party software application. In particular, various embodiments provide a software application architecture that permits a data party that owns or maintains protected data to support a software development ecosystem in which a third party can develop a third-party software application that uses the protected data while denying the third party access to the protected data.
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
G06F 21/53 - Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity, buffer overflow or preventing unwanted data erasure by executing in a restricted environment, e.g. sandbox or secure virtual machine
A support arm assembly for a head-worn device provides radio frequency (RF) shielding for a projector. A metal support arm, configured to structurally attach to a rear structural element and an optical element holder of the head-worn device, forms a rear face, a bottom face, and a top face of an enclosure. A metal front face of the enclosure attaches to the optical element holder, and defines a front aperture for permitting passage of light from an exit pupil of the projector toward an input optical element. The metal support arm forms a structural support joining the optical element holder to the rear structural element without placing mechanical load on the projector. A first side face of the enclosure and a second side face of the enclosure are electrically coupled to the metal support arm.
The present invention relates to improvements to systems and methods for determining a current location of a client device, and for identifying and selecting appropriate geo-fences based on the current location of the client device. An improved geo-fence selection system performs operations that include associating media content with a geo-fence that encompasses a portion of a geographic region, sampling location data from a client device, defining a boundary based on the sampled location data from the client device, detecting an overlap between the boundary and the geo-fence, retrieving the media content associated with the geo-fence, and loading the media content at a memory location of the client device, in response to detecting the overlap.
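For illustration, assuming both the sampled-location boundary and the geo-fence are circles, the overlap test in the abstract above reduces to comparing the center distance with the sum of the radii; the shapes, coordinates, and names are assumptions, not details from the filing.

    import math

    def circles_overlap(center_a, radius_a, center_b, radius_b) -> bool:
        # Overlap when the distance between centers is at most the sum of radii.
        return math.dist(center_a, center_b) <= radius_a + radius_b

    geofence = ((40.7128, -74.0060), 0.01)      # (center, radius) in degrees, illustrative
    boundary = ((40.7150, -74.0100), 0.005)     # boundary derived from sampled client locations

    if circles_overlap(geofence[0], geofence[1], boundary[0], boundary[1]):
        media = {"geofence_media": "overlay.png"}   # retrieve and pre-load the associated media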
Systems, methods, and computer readable media for 3D content display using head-wearable apparatuses. Example methods include a head-wearable apparatus that is configured to determine a position for a content item on a closest curved line, of a plurality of curved lines, to the head-wearable apparatus that has space for the content item. The method includes adjusting a shape of the content item based on the position of the content item on the closest curved line and a user view of a user of the head-wearable apparatus. The method includes causing the adjusted content item to be displayed on a display of the head-wearable apparatus at the position on the closest curved line. The curved lines are either higher or lower as they extend away from the head-wearable apparatus. Additionally, the curved line or the content item may be adjusted with a random movement for an organic appearance.
Image augmentation effects are provided on a device that includes a display and a camera. A simplified augmented reality effect is applied to a stream of images captured by the camera, to generate a preview stream of images. The preview stream of images is displayed on the display. A second stream of images corresponding to the first stream of images is saved to an initial video file. A full augmented reality effect, corresponding to the simplified augmented reality effect, is then applied to the second stream of images to generate a fully-augmented stream of images, which are saved to a further video file. The further video file can then be played back on the display to show the final, fully augmented reality effect as applied to the stream of images.
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for providing augmented reality-based makeup. The program and method provide for receiving a request to present augmented reality content in association with an image captured by a device camera, the image depicting a user's face; accessing an augmented reality content item applying makeup to the face, the augmented reality content configured to generate a mesh for tracking plural regions of the face and to present available makeup products with respect to the plural regions; presenting the augmented reality content item in association with the face depicted in the image; receiving user input selecting a region of the plural regions; determining a set of available makeup products corresponding to the selected region; and updating presentation of the augmented reality content item based on the set of available makeup products.
A method for aligning coordinate systems from separate augmented reality (AR) devices is described. In one aspect, the method includes generating predicted depths of a first point cloud by applying a pre-trained model to a first single image generated by a first monocular camera of a first augmented reality (AR) device, and first sparse 3D points generated by a first SLAM system at the first AR device, generating predicted depths of a second point cloud by applying the pre-trained model to a second single image generated by a second monocular camera of the second AR device, and second sparse 3D points generated by a second SLAM system at the second AR device, determining a relative pose between the first AR device and the second AR device by registering the first point cloud with the second point cloud.
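The registration step in the abstract above is summarised below with a standard Kabsch/Procrustes solve; it assumes point-to-point correspondences between the two clouds are already known, which the abstract does not state, so treat it as an illustrative simplification rather than the filed method.

    import numpy as np

    def relative_pose(cloud_a: np.ndarray, cloud_b: np.ndarray):
        """Rigid transform (R, t) aligning cloud_a to cloud_b, assuming row-wise correspondences."""
        mu_a, mu_b = cloud_a.mean(0), cloud_b.mean(0)
        H = (cloud_a - mu_a).T @ (cloud_b - mu_b)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                  # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_b - R @ mu_a
        return R, t

    a = np.random.rand(50, 3)                     # predicted-depth point cloud from device A
    theta = 0.3
    R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                       [np.sin(theta),  np.cos(theta), 0.0],
                       [0.0, 0.0, 1.0]])
    t_true = np.array([0.5, -0.2, 1.0])
    b = a @ R_true.T + t_true                     # the same points as seen from device B
    R, t = relative_pose(a, b)
    assert np.allclose(R, R_true) and np.allclose(t, t_true)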
Methods and systems are disclosed for performing operations comprising: receiving an image that includes a depiction of a person wearing a fashion item; generating a segmentation of the fashion item worn by the person depicted in the image; receiving voice input associated with the person depicted in the image; in response to receiving the voice input, generating one or more augmented reality elements representing the voice input; and applying the one or more augmented reality elements to the fashion item worn by the person based on the segmentation of the fashion item worn by the person.
A method for dynamically initializing a 3 degrees of freedom (3DOF) tracking device is described. In one aspect, the method includes accessing a gyroscope signal from a gyroscope of the 3DOF tracking device, accessing an accelerometer signal from an accelerometer of the 3DOF tracking device, determining an initial state that includes a combination of an initial orientation, an initial position, and an initial velocity of the 3DOF tracking device, the initial state indicating a starting condition of the 3DOF tracking device, integrating the gyroscope signal and the accelerometer signal to obtain orientation and position signals using the initial state, and refining an inclination signal of the orientation signal using the position signal.
G06F 3/0346 - Pointing devices displaced or positioned by the user; Accessories therefor, with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
G06F 3/038 - Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
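As a sketch only for the 3DOF initialization abstract above: discrete-time integration of gyroscope and accelerometer samples from an assumed initial state, using a small-angle update for orientation. The actual initialization and inclination refinement described in the abstract are more involved; all names and values here are illustrative.

    import numpy as np

    def integrate(gyro, accel, dt, R0=None, p0=None, v0=None):
        """Naive strapdown integration from an initial state (illustrative, not the filed method)."""
        # Initial state: orientation as a rotation matrix, plus position and velocity.
        R = np.eye(3) if R0 is None else R0
        p = np.zeros(3) if p0 is None else p0.copy()
        v = np.zeros(3) if v0 is None else v0.copy()
        g = np.array([0.0, 0.0, -9.81])
        for w, a in zip(gyro, accel):
            wx = np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])
            R = R @ (np.eye(3) + wx * dt)          # small-angle orientation update
            acc_world = R @ a + g                  # remove gravity in the world frame
            v = v + acc_world * dt
            p = p + v * dt
        return R, p, v

    gyro = np.zeros((100, 3))                      # stationary device, illustrative samples
    accel = np.tile([0.0, 0.0, 9.81], (100, 1))
    R, p, v = integrate(gyro, accel, dt=0.01)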
A system for deformation or bending correction in an Augmented Reality (AR) system. Sensors are positioned in a frame of a head-worn AR system to sense forces or pressure acting on the frame by temple pieces attached to the frame. The sensed forces or pressure are used in conjunction with a model of the frame to determine a corrected model of the frame. The corrected model is used to correct video data captured by the AR system and to correct a video virtual overlay that is provided to a user wearing the head-worn AR system.
Methods and systems are disclosed for performing operations comprising: receiving an image that includes a depiction of a person wearing a fashion item; generating a segmentation of the fashion item worn by the person depicted in the image; identifying a facial expression of the person depicted in the image; and in response to identifying the facial expression, applying one or more augmented reality elements to the fashion item worn by the person based on the segmentation of the fashion item worn by the person.
A messaging system performs engagement analysis based on labels associated with content items produced by users of the messaging system. The messaging system is configured to process content items comprising images to identify elements in the images and determine labels for the images based on conditions indicating when to associate a label of the labels with an image of the images based on the elements in the image. The messaging system is further configured to associate the label with the content item in response to determining to associate the label with the image. The messaging system is further configured to determine engagement scores for the label based on interactions of users with the content items associated with the label and adjust the engagement scores to determine trends in the labels to generate adjusted engagement scores.
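One plausible reading of the score-and-adjust step above, with invented interaction weights and a recent-versus-baseline trend adjustment; none of the specific weights, windows, or field names come from the abstract.

    from collections import defaultdict

    WEIGHTS = {"view": 1.0, "share": 3.0, "reply": 2.0}   # hypothetical interaction weights

    def engagement_scores(interactions):
        """interactions: list of (label, kind, day). Returns adjusted per-label scores."""
        recent, baseline = defaultdict(float), defaultdict(float)
        for label, kind, day in interactions:
            bucket = recent if day >= 7 else baseline      # last week vs. earlier (assumed window)
            bucket[label] += WEIGHTS.get(kind, 0.0)
        # Adjust raw scores by a trend factor so rising labels rank higher.
        return {label: recent[label] * (1.0 + recent[label] / (baseline[label] + 1.0))
                for label in set(recent) | set(baseline)}

    scores = engagement_scores([("pets", "view", 8), ("pets", "share", 9), ("food", "view", 2)])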
The subject technology applies a three-dimensional (3D) effect to image data and depth data based at least in part on an augmented reality content generator. The subject technology generates a segmentation mask based at least on the image data. The subject technology performs background inpainting and blurring of the image data using at least the segmentation mask to generate background inpainted image data. The subject technology generates a packed depth map based at least in part on a depth map of the depth data. The subject technology generates, using the processor, a message including information related to the applied 3D effect, the image data, and the depth data.
G06T 19/00 - Manipulating 3D models or images for computer graphics
G06F 3/04842 - Selection of displayed objects or displayed text elements
G06F 3/04883 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
A system and method for presentation of computer vision (e.g., augmented reality, virtual reality) using user data and a user code is disclosed. A client device can detect an image feature (e.g., scannable code) in one or more images. The image feature is determined to be linked to a user account. User data from the user account can then be used to generate one or more augmented reality display elements that can be anchored to the image feature in the one or more images.
G06T 11/60 - Editing figures and text; Combining figures or text
A63F 13/00 - Video games, i.e. games using an electronically generated display having two or more dimensions
A63F 13/213 - Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
A63F 13/352 - Details of game servers involving special game server arrangements, e.g. regional servers connected to a national server or a plurality of servers managing partitions of the game world
A63F 13/58 - Controlling game characters or game objects based on the game progress by computing conditions of game characters, e.g. stamina, strength, motivation or energy level
A63F 13/65 - Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition
A63F 13/79 - Game security or game management aspects involving player-related data, e.g. identities, accounts, preferences or play histories
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
Method for receiving an input onto a graphical user interface at a client device, capturing an image frame at the client device, the image frame comprising a depiction of an object, identifying the object within the image frame, accessing media content associated with the object within a media repository in response to identifying the object, and causing presentation of the media content within the image frame at the client device.
G06F 16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
G06F 3/0488 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
G06V 10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
An eyewear device that accurately and dynamically adjusts color and brightness of a see-through display as a function of a user's eye gaze direction and eye position using a display characteristic map. The display characteristic map is indicative of display characteristics of the see-through display. Color masks are generated as a function of the display characteristic map and the user's eye gaze direction and eye position, and a processor adjusts the see-through display characteristics based on the color masks.
Systems and methods are presented that provide for receiving, at a media overlay publication system from a first client device, content to generate a media overlay, and generating the media overlay using the content received from the client device. The generated media overlay is stored in a database associated with the media overlay publication system and associated with a first characteristic of the content received from the first client device. The media overlay is provided to a second client device when a second characteristic of context data associated with the second client device correlates to the first characteristic for the media overlay, causing a display of the media overlay on a user interface of the second client device.
G06T 11/60 - Editing figures and text; Combining figures or text
G06F 3/04842 - Selection of displayed objects or displayed text elements
G06F 3/04845 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
H04L 67/52 - Network services specially adapted for the location of the user terminal
H04N 21/431 - Generation of visual interfaces; Content or additional data rendering
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies or resolving scheduling conflicts
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end or controlling the complexity of a video stream by scaling the resolution or bit-rate based on the client capabilities
A method for secure virtual currency transactions between applications operating in different security domains. A first application in a first security domain receives a request from a second application in a second security domain to access a virtual currency store, where the first security domain restricts the second application from accessing user data. The first application accesses user account data containing a virtual currency balance within its secure domain and displays a virtual currency interface with multiple virtual items. Upon receiving a user's selection of a virtual item, a purchase is initiated using the virtual currency balance, the balance is adjusted accordingly, and a purchase notification is transmitted to the second application while maintaining security restrictions on user data access.
G06F 21/62 - Protecting access to data via a platform, e.g. using keys or access control rules
G06F 3/048 - Interaction techniques based on graphical user interfaces [GUI]
G06F 9/451 - Execution arrangements for user interfaces
G06F 16/70 - Information retrieval; Database structures therefor; File system structures therefor of video data
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
H04L 51/04 - Real-time or near real-time messaging, e.g. instant messaging [IM]
H04L 51/046 - Interoperability with other network applications or services
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
H04L 67/02 - Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
H04L 67/53 - Network services using third party service providers
H04M 1/72436 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
H04N 1/00 - Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
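Returning to the virtual-currency abstract above, here is a hypothetical sketch of the cross-domain flow: the first application keeps the balance and other user data private within its domain and only sends a purchase notification back to the requesting application. Class and field names are invented for illustration.

    class CurrencyStore:
        """First application's secure domain: holds user data the second app may not read."""
        def __init__(self, balance: int, catalog: dict[str, int]):
            self._balance = balance          # private: never exposed across the domain boundary
            self._catalog = catalog

        def list_items(self) -> dict[str, int]:
            return dict(self._catalog)       # the virtual currency interface shows items and prices

        def purchase(self, item: str, notify) -> None:
            price = self._catalog[item]
            if self._balance < price:
                raise ValueError("insufficient balance")
            self._balance -= price
            notify({"item": item, "status": "purchased"})   # notification only, no balance data

    received = []
    store = CurrencyStore(balance=100, catalog={"sticker_pack": 40})
    store.purchase("sticker_pack", received.append)          # second app supplies the callback
    assert received == [{"item": "sticker_pack", "status": "purchased"}]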
49.
GENERATING USER INTERFACES DISPLAYING AUGMENTED REALITY GRAPHICS
An Augmented Reality (AR) graphics system is provided. The AR graphics system may coordinate the display of augmented reality graphics created by multiple users located in an environment. The AR graphics system may determine an alignment object located in the environment that is designated as a common origin of a real-world coordinate system that is used to determine where to display AR graphics within the environment. Additionally, a prioritization scheme is implemented to resolve conflicts between overlapping input provided by different users in order to generate a single version of AR graphics.
G06T 19/00 - Manipulating 3D models or images for computer graphics
G06F 3/04815 - Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
An Augmented Reality (AR) system provides stabilization of hand-tracking input data. The AR system provides for display a user interface of an AR application. The AR system captures, using one or more cameras of the AR system, video frame tracking data of a gesture being made by a user while the user interacts with the AR user interface. The AR system generates skeletal 3D model data of a hand of the user based on the video frame tracking data that includes one or more skeletal 3D model features corresponding to recognized visual landmarks of portions of the hand of the user. The AR system generates targeting data based on the skeletal 3D model data where the targeting data identifies a virtual 3D object of the AR user interface. The AR system filters the targeting data using a targeting filter component and provides the filtered targeting data to the AR application.
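The abstract above does not name the targeting filter; an exponential moving average over the targeted 3D point is one simple stand-in, sketched below with invented class and parameter names.

    import numpy as np

    class EmaTargetFilter:
        """Illustrative stand-in for the targeting filter component (EMA smoothing)."""
        def __init__(self, alpha: float = 0.3):
            self.alpha = alpha
            self._state = None

        def update(self, target_point: np.ndarray) -> np.ndarray:
            if self._state is None:
                self._state = target_point.astype(float)
            else:
                self._state = self.alpha * target_point + (1.0 - self.alpha) * self._state
            return self._state

    f = EmaTargetFilter()
    noisy = [np.array([0.0, 1.0, 2.0]) + np.random.normal(scale=0.01, size=3) for _ in range(30)]
    smoothed = [f.update(p) for p in noisy]   # filtered targeting data passed to the AR application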
In some examples, a method to present an affordance user interface element within a user interface of an interaction application includes detecting an association of a supplemental media content item with a primary media content item presented within the user interface. The supplemental media content item is identified from among a plurality of supplemental media content items supported by the interaction application. The method may include retrieving metadata related to the supplemental media content item and presenting, within the user interface, a supplementation affordance that presents the metadata. In some examples, the supplementation affordance is user selectable via the user interface to invoke a supplementation function that enables a user to apply the supplemental media content item to a further primary media content item. The supplementation function is invoked responsive to detecting a user selection of the supplementation affordance within the user interface.
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
Systems, methods, and computer-readable media for adding beauty products to tutorials are presented. Methods include accessing video data comprising images of a presenter creating a tutorial, the tutorial depicting the presenter applying a beauty product to a body part of the presenter. Methods further include processing the video data to identify changes to the body part of the presenter from an application of the beauty product, and responding to identifying changes to the body part of the presenter from the application of the beauty product by processing the video data to identify the beauty product. Methods further include retrieving information regarding the beauty product and causing presentation of information regarding the beauty product on a display device.
G06K 7/10 - Methods or arrangements for sensing record carriers by electromagnetic radiation, e.g. optical sensing; Methods or arrangements for sensing record carriers by corpuscular radiation
Systems, methods, devices, and instructions are described for fast boot of a processor as part of camera operation. In some embodiments, in response to a camera input, a digital signal processor (DSP) of a device is booted using a first set of instructions. Capture of image sensor data is initiated using the first set of instructions at the DSP. The DSP then receives a second set of instructions and the DSP is programmed using the second set of instructions after at least a first frame of the image sensor data is stored in a memory of the device. The first frame of the image sensor data is processed using the DSP as programmed by the second set of instructions. In some embodiments, the first set of instructions includes only instructions for setting camera sensor values, and the second set of instructions includes instructions for processing raw sensor data into formatted image files.
An electronic eyewear device communicates with a backend service system via a device hub that provides an edge proxy server for a service request from the electronic eyewear device to the backend service system. The device hub provides a standardized request/response optimized schema for providing a standardized communication between the electronic eyewear device and the backend service system in response to the service request in a standardized format adapted to minimize network requests. A standardized communication is provided to at least one backend service of the backend service system, and a standardized response to the standardized service request is received from the backend service(s) and provided to the electronic eyewear device. In one configuration, the device hub may issue asynchronous requests to backend services in response to a service request from the electronic eyewear device and merge responses into a standardized response for the electronic eyewear device.
Optical devices and methods for expanding input light and outputting the expanded light include a waveguide and an input optical element to receive light incident on a first side of the waveguide. The input optical element includes an input reflective surface to reflect the received light into the waveguide. An intermediate diffractive optical element receives light in the waveguide from a first direction, and provides an expansion of the received light in a second direction perpendicular to the first direction. An output optical element includes an output reflective surface to reflect the expanded light out of the waveguide towards a viewer. The waveguide guides light along an optical path from the input optical element to the intermediate diffractive optical element and from the intermediate diffractive optical element to the output optical element.
Method starts with processor causing virtual reality (VR) interface for communication session to be displayed on first user interface of a first head-wearable apparatus and on second user interface of second head-wearable apparatus. Processor detects first touch input from first VR input device and second touch input from second VR input device. Processor monitors location of the first touch input within the first user interface and location of the second touch input within second user interface. Processor determines distance between location of the first touch input within first user interface and location on first user interface corresponding to location of second touch input within second user interface. Processor causes first and second VR input devices to generate haptic feedback response based on the distance. Haptic feedback response increases in intensity or speed as distance decreases and decreases in intensity or speed as distance increases. Other embodiments are described herein.
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06F 3/0488 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
H04L 65/1089 - In-session procedures by adding media; In-session procedures by removing media
Methods and systems are disclosed for generating a custom sticker. In one embodiment, a messaging application implemented on a first device receives a video and input that draws a selection of a region of the video. The messaging application generates a graphical element comprising the region of the video drawn by the input and applies one or more visual effects to the graphical element to create a custom graphic. The custom graphic with the one or more visual effects is sent from the first device to a second device.
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for providing augmented reality-based makeup. The program and method provide for receiving, by a messaging application running on a device of a user, a request to present augmented reality content in association with an image captured by a device camera, the image depicting a face of the user; accessing an augmented reality content item configured to generate a plurality of completed looks with respect to applying makeup to the face; presenting the augmented reality content item, including the plurality of completed looks, in association with the face depicted in the image; receiving user input selecting a completed look of the plurality of completed looks; and displaying, in response to receiving the user input, an interface with a set of makeup products associated with the selected completed look.
A lift reporting system to perform operations that include: accessing user behavior data associated with one or more machine-learned (ML) models, the ML models associated with identifiers; determining causal conversions associated with the ML models based on the user behavior data, the causal conversions comprising values; performing a comparison between the values that represent the causal conversions; determining a ranking of the ML models based on the comparison; and causing display of a graphical user interface (GUI) that includes a display of identifiers associated with the ML models.
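As a toy illustration of the comparison and ranking steps only, the snippet below orders model identifiers by an already-computed causal-conversion value; the dictionary shape and the values are invented for the example.

```python
def rank_models(causal_conversions):
    """Order ML model identifiers by causal-conversion value, highest first."""
    return sorted(causal_conversions.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical causal-conversion values keyed by ML model identifier.
values = {"model-a": 0.042, "model-b": 0.057, "model-c": 0.031}
for model_id, value in rank_models(values):
    print(f"{model_id}: {value:.3f}")   # the GUI would display these identifiers in ranked order
```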
A first extended reality (XR) device and a second XR device are colocated in an environment. The first XR device captures sensory data of a wearer of the second XR device. The sensory data is used to determine a time offset between a first clock of the first XR device and a second clock of the second XR device. The first clock and the second clock are synchronized based on the time offset and a shared coordinate system is established. The shared coordinate system enables alignment of virtual content that is simultaneously presented by the first XR device and the second XR device based on the synchronization of the first clock and the second clock.
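The abstract does not specify how the offset is derived from the captured sensory data; as a generic illustration of estimating and applying a time offset, the sketch below uses the standard symmetric-exchange (NTP-style) calculation, which is a swapped-in technique rather than the disclosed method, and the timestamps are made up.

```python
def estimate_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """NTP-style offset of the remote clock relative to the local clock.

    t1: local send time, t2: remote receive time,
    t3: remote send time, t4: local receive time.
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0

def to_shared_time(local_timestamp: float, offset: float) -> float:
    """Map a local timestamp into the peer's (shared) timeline for aligned rendering."""
    return local_timestamp + offset

# Example exchange (seconds): the remote clock is about 0.25 s ahead of the local clock.
offset = estimate_offset(t1=100.000, t2=100.260, t3=100.261, t4=100.021)
print(round(offset, 3))             # ~0.25
print(to_shared_time(101.0, offset))
```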
Systems, methods, devices, instructions, and media are described for generating suggestions for connections between accounts in a social media system. One embodiment involves storing connection graph information for a plurality of user accounts, and identifying, by one or more processors of the device, a first set of connection suggestions based on a first set of suggestion metrics. A second set of connection suggestions is then identified based on a second set of suggestion metrics, wherein the second set of connection suggestions and the second set of suggestion metrics are configured to obscure the first set of connection suggestions, and a set of suggested connections is generated based on the first set of connection suggestions and the second set of connection suggestions. The set of connection suggestions is then communicated to a client device method associated with the first account.
G09G 5/395 - Arrangements specially adapted for transferring the contents of the bit-mapped memory to the screen
G09G 3/36 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix by control of light from an independent source using liquid crystals
A system and a method for automated GIF file generation are described. In one aspect, the method includes accessing an animated GIF file, identifying a plurality of elements displayed in the animated GIF file, applying a variation to one or more of the elements, and generating a variant animated GIF file by applying the variation of the one or more elements to the animated GIF file. The system measures a trending metric of the variant animated GIF file based on a number of times the variant animated GIF file is shared on a communication platform and uses the trending metric as feedback for generating the variant animated GIF file.
An encapsulated waveguide system for a near eye optical display includes a first outer layer, a second outer layer, at least one waveguide substrate comprising an input area and an output area, a first spacer and a sealing element. The at least one waveguide substrate is disposed between the first and second outer layers and spaced therefrom by the first spacer. The sealing element joins edges of the first and second outer layers so as to encapsulate the at least one waveguide substrate within a cavity formed by the first and second outer layers. The formed cavity includes a first cavity between the at least one waveguide substrate and the first outer layer and a second cavity between the at least one waveguide substrate and the second outer layer.
Systems and methods for text and audio-based real-time face reenactment are provided. An example method includes receiving an input image including a body of a person, fitting a model to the body in the input image, where the model is configured to generate an output image including the body adopting a pose based on a set of pose parameters, generating, based on the input image and the model, a three-dimensional (3D) mesh of the body, generating a texture map for the 3D mesh, modifying the texture map to modify an appearance of at least a portion of the body, and generating, based on the modified texture map and the set of pose parameters, the output image of the body adopting the pose with the modified appearance.
Systems, computer-readable media, and methods for autonomous drone stabilization and navigation are disclosed. Example methods include capturing an image using an image capturing device of the autonomous drone, processing the image to identify an object, and navigating the autonomous drone relative to the object to one or more waypoints. The autonomous drone navigates initially based on a relative location of the autonomous drone from the object. The autonomous drone determines a distance from the object based on an estimated size of the object and the number of pixels of an image sensor that the object occupies. The autonomous drone determines a height above the ground to assist in navigation. Additionally, the autonomous drone hovers to determine a windspeed.
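The distance-from-apparent-size step can be illustrated with the standard pinhole-camera relation; the focal length in pixels and the assumed real-world object size below are made-up inputs, not values from the disclosure.

```python
def estimate_distance_m(object_size_m: float, object_size_px: float, focal_length_px: float) -> float:
    """Pinhole-camera estimate: distance = focal_length_px * real_size / apparent_size_in_pixels."""
    if object_size_px <= 0:
        raise ValueError("object must occupy at least one pixel")
    return focal_length_px * object_size_m / object_size_px

# A person ~1.7 m tall spanning 120 px, imaged with a 600 px focal length, is about 8.5 m away.
print(estimate_distance_m(object_size_m=1.7, object_size_px=120, focal_length_px=600))
```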
Multiple users working on separate user systems can simultaneously view and scroll content from a collection of content items. Example methods include generating a group feed, determining, based on metadata associated with the first user and metadata associated with the second user, a first content item of the plurality of content items, and causing the first content item to be displayed on a first computing device and on a second computing device. The methods may further include accessing, from the first user or the second user, an indication of a reaction to the first content item, accessing, from the first user or the second user, an indication to scroll to a second content item of the plurality of content items, and determining the second content item based on the metadata of the first user, the metadata of the second user, and the reaction to the first content item.
G06Q 50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
An audio track with vocals is played back using a device with a display screen that displays a video feed from a camera. A location of a mouth depicted in the video feed is detected. A timestamp of playback of the audio track is compared to viseme-timestamp data for the audio track to identify a viseme corresponding to the timestamp of the audio playback, and the viseme is positioned at the detected location of the mouth in the video feed.
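A minimal sketch of the timestamp-to-viseme lookup, assuming the viseme-timestamp data is a list of (start_time, viseme) pairs sorted by start time; the track contents and data shape are illustrative.

```python
import bisect

# Hypothetical viseme-timestamp data: (start_time_seconds, viseme_id), sorted by time.
VISEME_TRACK = [(0.00, "sil"), (0.42, "AA"), (0.80, "M"), (1.05, "OO"), (1.40, "sil")]

def viseme_at(playback_time: float, track=VISEME_TRACK) -> str:
    """Return the viseme whose start time most recently precedes the playback timestamp."""
    starts = [t for t, _ in track]
    i = bisect.bisect_right(starts, playback_time) - 1
    return track[max(i, 0)][1]

print(viseme_at(0.9))   # "M": this viseme would be overlaid at the detected mouth location
```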
The present disclosure relates to systems and methods for enhancing the interaction between users and automated agents, such as digital assistants, by employing Large Language Models (LLMs) to infer the intent of spoken language. The invention involves continuously monitoring ambient audio, converting speech to text, and utilizing LLMs to determine whether spoken language is intended for the automated agent. A structured prompt, including the converted text and specific instructions, is sent to the LLM, which is fine-tuned to process domain-specific prompts. The LLM provides a structured output in a standardized format, indicating the user's intent. The system may involve multiple prompts to perform separate tasks, such as identifying intent and generating additional context-specific data. This approach facilitates a more natural and intuitive user experience by eliminating the need for wake words and allowing seamless conversational interaction with virtual assistants across various platforms and devices.
G10L 15/18 - Speech classification or search using natural language modelling
B60H 1/00 - Heating, cooling or ventilating devices
B60R 16/037 - Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric for occupant comfort
G10L 15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 25/87 - Detection of discrete points within a voice signal
Multiple users working on separate user systems can simultaneously view and scroll content from a collection of content items. Example methods include generating a group feed, determining, based on metadata associated with the first user and metadata associated with the second user, a first content item of the plurality of content items, and causing the first content item to be displayed on a first computing device and on a second computing device. The methods may further include accessing, from the first user or the second user, an indication of a reaction to the first content item, accessing, from the first user or the second user, an indication to scroll to a second content item of the plurality of content items, and determining the second content item based on the metadata of the first user, the metadata of the second user, and the reaction to the first content item.
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
G06Q 50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
The present disclosure relates to systems and methods for enhancing the interaction between users and automated agents, such as digital assistants, by employing Large Language Models (LLMs) to infer the intent of spoken language. The invention involves continuously monitoring ambient audio, converting speech to text, and utilizing LLMs to determine whether spoken language is intended for the automated agent. A structured prompt, including the converted text and specific instructions, is sent to the LLM, which is fine-tuned to process domain-specific prompts. The LLM provides a structured output in a standardized format, indicating the user's intent. The system may involve multiple prompts to perform separate tasks, such as identifying intent and generating additional context-specific data. This approach facilitates a more natural and intuitive user experience by eliminating the need for wake words and allowing seamless conversational interaction with virtual assistants across various platforms and devices.
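A hedged sketch of the structured-prompt step: the transcribed text is wrapped with instructions asking for a fixed JSON schema that indicates whether the utterance was addressed to the agent. The `call_llm` callable stands in for whatever fine-tuned LLM endpoint is actually used, and both the schema and the stub reply are invented for the example.

```python
import json

PROMPT_TEMPLATE = """You are an intent classifier for a voice assistant.
Decide whether the following transcribed speech is addressed to the assistant.
Respond with JSON only, in the form:
{{"addressed_to_agent": true|false, "intent": "<short label or null>"}}

Transcript: "{transcript}"
"""

def classify_intent(transcript: str, call_llm) -> dict:
    """Build the structured prompt, send it to the (assumed) LLM callable,
    and parse the standardized JSON output."""
    prompt = PROMPT_TEMPLATE.format(transcript=transcript)
    raw = call_llm(prompt)                 # hypothetical: returns the model's text reply
    return json.loads(raw)

# Stand-in for a real model, so the sketch runs end to end.
def fake_llm(prompt: str) -> str:
    return '{"addressed_to_agent": true, "intent": "set_timer"}'

print(classify_intent("set a timer for ten minutes", fake_llm))
```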
Users of a chat system within an interactive platform can suspend the expiration of a plurality of content items. Example methods include generating a chat, the chat comprising an association between a first user account and a second user account, and receiving, from a first user system associated with the first user account, an indication of a plurality of content items and an indication of the chat. The method may further include sending, to a second user account, the plurality of content items and an indication of the chat, and receiving, from the second user account, an indication to save the plurality of content items within the chat. The method may further include setting a saved data field associated with the plurality of content items and the second user account, the saved data field indicating the plurality of content items do not expire within the chat.
Systems, devices, media, and methods are presented for an immersive augmented reality (AR) experience using an eyewear device. A portable eyewear device includes a processor, a memory, and a display projected onto at least one lens assembly. The memory has programming stored therein that, when executed by the processor, captures information depicting an environment surrounding the device and identifies a match between objects in that information and predetermined objects in previously obtained information for the same environment. When the position of the eyewear device reaches a preselected location with respect to the matched objects, a physical output is provided to produce the immersive experience. The physical output changes as the position of the eyewear device moves to maintain the immersive experience.
A messaging server system receives a message creation input from a first client device that is associated with a first user registered with the messaging server system. The messaging server system determines, based on an entity graph representing connections between a plurality of users registered with the messaging server system, that the first user is within a threshold degree of connection with a second user that initiated a group story in relation to a specified event. The messaging server system determines, based on location data received from the first client device, that the first client device was located within a geo-fence surrounding a geographic location of the specified event during a predetermined event window, the geo-fence and event window having been designated by the second user, and causes the first client device to present a user interface element that enables the first user to submit content to the group story.
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
H04L 51/222 - Monitoring or handling of messages using geographical location information, e.g. messages transmitted or received in proximity of a certain spot or area
H04M 1/72436 - User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
H04W 4/021 - Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
Systems, computer-readable media, and methods for fully autonomous drone flight are disclosed. Example methods include taking off, navigating in accordance with a flight plan, and navigating the autonomous drone to land. The autonomous drone performs flight plans with only an initial command for the autonomous drone to fly and, in some examples, an indication of a landing space such as an open hand presented under the autonomous drone. After an initial fly command, the autonomous drone is not controlled by a remote-control device and does not receive any additional commands to complete the flight plan. The autonomous drone enters a lower-energy state while flying, in which the wireless connections are turned off, since the autonomous drone does not respond to commands during flight.
Systems and methods for text and audio-based real-time face reenactment are provided. An example method includes receiving an input image including a body of a person, fitting a model to the body in the input image, generating a warped depth map and a warped normal map corresponding to the body in the input image, generating, based on the warped depth map and the warped normal map, a point cloud representing a surface of the body, generating, by traversing the point cloud, a first mesh for a front side surface of the body and a second mesh for a back side surface of the body, and merging the first mesh and the second mesh into a reconstructed three-dimensional mesh of the body.
In a camera-enabled electronic device, photo capture is triggered by a press-and-hold input only if the holding duration of the press-and-hold input is greater than a predefined threshold duration. A press-and-hold input shorter in duration than the threshold triggers video capture. Thus, a short press triggers video capture, while a long press triggers photo capture.
H04N 23/667 - Camera operation mode switching, e.g. between still and video, sport and normal or high and low resolution modes
G08B 5/36 - Visible signalling systems, e.g. personal calling systems, remote indication of seats occupied using electric transmission; Visible signalling systems, e.g. personal calling systems, remote indication of seats occupied using electromagnetic transmission using visible light sources
An audio track with vocals is played back using a device with a display screen that displays a video feed from a camera. A location of a mouth depicted in the video feed is detected. A timestamp of playback of the audio track is compared to viseme-timestamp data for the audio track to identify a viseme corresponding to the timestamp of the audio playback, and a viseme is positioned at the detected location of the mouth in the video feed.
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
G11B 27/11 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information not detectable on the record carrier
G11B 27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
G11B 27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronizing decoder's clock; Client middleware
H04N 21/8547 - Content authoring involving timestamps for synchronizing content
80.
Display screen or portion thereof with a graphical user interface
Loading of ML models into, and unloading of ML models from, an ML model cache or system memory of an electronic eyewear device are managed based on which applications are active or available and on predicted activities. Sensor inputs are processed to detect whether the electronic eyewear device has moved or is predicted to move, and new ML models are downloaded based on updated location information or observable visual information. Sensor inputs are also processed to determine whether the electronic eyewear device has changed state or resource availability and whether the ML model cache or system memory needs to be resized to accommodate new ML models for the changed conditions. If so, stored ML models are updated to reflect the new device state by unloading an ML model, receiving a new ML model based on the changed state or resource availability and a processing priority of the new ML model, or both.
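A rough sketch of the cache bookkeeping described above: when a newly required model does not fit within an assumed memory budget, lower-priority models are unloaded first. The `CachedModel` record, byte sizes, and priorities are illustrative, not the disclosed management policy.

```python
from dataclasses import dataclass

@dataclass
class CachedModel:
    name: str
    size_bytes: int
    priority: int          # higher = more important to keep resident

class ModelCache:
    """Keeps loaded ML models within a memory budget, evicting low-priority models
    when a newly required model (e.g. after a location or state change) must be loaded."""

    def __init__(self, budget_bytes: int):
        self.budget_bytes = budget_bytes
        self.models = []   # list of CachedModel currently resident

    def used(self) -> int:
        return sum(m.size_bytes for m in self.models)

    def load(self, model: CachedModel):
        """Load a model, unloading lowest-priority models first if needed.
        Returns the names of evicted models."""
        evicted = []
        self.models.sort(key=lambda m: m.priority)   # lowest priority first
        while self.used() + model.size_bytes > self.budget_bytes and self.models:
            victim = self.models.pop(0)
            evicted.append(victim.name)
        if self.used() + model.size_bytes <= self.budget_bytes:
            self.models.append(model)
        return evicted

cache = ModelCache(budget_bytes=50_000_000)
cache.load(CachedModel("indoor-scene", 30_000_000, priority=1))
print(cache.load(CachedModel("outdoor-landmarks", 30_000_000, priority=3)))  # evicts "indoor-scene"
```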
Systems, devices, media, and methods are presented for selectively activating and suspending control of a graphical user interface by two or more electronic devices. A portable eyewear device includes a display projected onto at least one lens assembly and a primary touchpad through which the user may access a graphical user interface (GUI) on the display. A handheld accessory device, such as a ring, includes an auxiliary touchpad that is configured to emulate the primary touchpad. The eyewear processor temporarily suspends inputs from one touchpad when it detects an activation signal from the other touchpad.
G06F 3/0354 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of 2D relative movements between the device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06F 3/0346 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
G06F 3/038 - Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
G06F 3/04883 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
An optical waveguide is disclosed. The optical waveguide is to provide pupil expansion in two dimensions with input and output ends and having a first axis substantially parallel to the direction of propagation of light in the waveguide and substantially parallel with a direction from the input end to the output end. The optical waveguide includes an input region; a beam splitter to expand light received from the input region; and a symmetrical diffraction grating comprising complementary first and second grating portions. The second grating portion is substantially symmetrical to the first grating portion along a line of symmetry that is substantially parallel to the first axis. Light received at the diffraction grating from the beam splitter is to be diffracted by the grating towards the line of symmetry by the first or second grating portion.
Examples described herein relate to hand-based light estimation for extended reality (XR). An image sensor of an XR device is used to obtain an image of a hand in a real-world environment. At least part of the image is processed to detect a pose of the hand. One of a plurality of machine learning models is selected based on the detected pose. At least part of the image is processed via the machine learning model to obtain estimated illumination parameter values associated with the hand. The estimated illumination parameter values are used to render virtual content to be presented by the XR device.
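As an illustration of the pose-conditioned model selection, the dispatch below maps a detected pose label to one of several estimator callables; the pose labels, the stand-in estimators, and the returned illumination parameters are all invented for the example.

```python
import numpy as np

def estimator_open_palm(crop: np.ndarray) -> dict:
    # Placeholder for a trained model; returns illustrative illumination parameters.
    return {"ambient": float(crop.mean() / 255.0), "dominant_direction": (0.0, 1.0, 0.0)}

def estimator_fist(crop: np.ndarray) -> dict:
    return {"ambient": float(crop.mean() / 255.0) * 0.8, "dominant_direction": (0.0, 0.7, 0.7)}

# One estimator per recognized hand pose (labels are assumptions).
MODELS_BY_POSE = {"open_palm": estimator_open_palm, "fist": estimator_fist}

def estimate_lighting(hand_crop: np.ndarray, detected_pose: str) -> dict:
    model = MODELS_BY_POSE.get(detected_pose, estimator_open_palm)   # fallback choice is arbitrary
    return model(hand_crop)

print(estimate_lighting(np.full((64, 64), 180, dtype=np.uint8), "fist"))
```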
Examples described herein relate to stateful inference of a neural network. A plurality of feature map segments each have a first set of values stored in a compressed manner. The first sets of values at least partially represent an extrinsic state memory of the neural network after processing of a previous input frame. Operations are performed with respect to each feature map segment. The operations include decompressing and storing the first set of values. The operations further include updating at least a subset of the decompressed first set of values based on a current input frame to obtain a second set of values. The second set of values is compressed and stored. Memory resources used to store the decompressed first set of values are released. The second sets of values at least partially represent the extrinsic state memory of the neural network after processing of the current input frame.
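A toy illustration of the per-segment cycle (decompress, update from the current frame, recompress, release), using zlib and NumPy only to make the bookkeeping concrete; the blending update is an arbitrary stand-in for the network's actual computation.

```python
import zlib
import numpy as np

def compress_segment(values: np.ndarray) -> bytes:
    return zlib.compress(values.astype(np.float32).tobytes())

def decompress_segment(blob: bytes, shape) -> np.ndarray:
    return np.frombuffer(zlib.decompress(blob), dtype=np.float32).reshape(shape)

def step(state_segments, shape, frame: np.ndarray):
    """Advance the extrinsic state memory by one input frame, one segment at a time."""
    new_segments = []
    for blob in state_segments:
        seg = decompress_segment(blob, shape)           # first set of values
        updated = 0.9 * seg + 0.1 * frame               # stand-in update from the current frame
        new_segments.append(compress_segment(updated))  # second set of values, stored compressed
        del seg, updated                                # release decompressed buffers
    return new_segments

shape = (4, 4)
state = [compress_segment(np.zeros(shape, dtype=np.float32)) for _ in range(3)]
state = step(state, shape, frame=np.ones(shape, dtype=np.float32))
print(decompress_segment(state[0], shape)[0, 0])        # 0.1
```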
Examples described herein relate to hand-based light estimation for extended reality (XR). An image sensor of an XR device is used to obtain an image of a hand in a real-world environment. At least part of the image is processed to detect a pose of the hand. One of a plurality of machine learning models is selected based on the detected pose. At least part of the image is processed via the machine learning model to obtain estimated illumination parameter values associated with the hand. The estimated illumination parameter values are used to render virtual content to be presented by the XR device.
Visual-inertial tracking of an eyewear device using one or more rolling shutter cameras. The device includes a position determining system. Visual-inertial tracking is implemented by sensing motion of the device. An initial pose is obtained for a rolling shutter camera and an image of an environment is captured. The image includes feature points captured at a particular capture time. A number of poses for the rolling shutter camera is computed based on the initial pose and sensed movement of the device. The number of computed poses is responsive to the sensed movement of the mobile device. A computed pose is selected for each feature point in the image by matching the particular capture time for the feature point to the particular computed time for the computed pose. The position of the mobile device is determined within the environment using the feature points and the selected computed poses for the feature points.
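A small sketch of the pose-selection step: poses are computed at several times across the rolling-shutter readout, and each feature point takes the pose whose time is nearest its own capture time. The 2D-translation pose representation and the timing values are simplifications for the example.

```python
def select_pose(feature_time: float, computed_poses):
    """Return the precomputed pose whose timestamp is nearest the feature's capture time."""
    return min(computed_poses, key=lambda tp: abs(tp[0] - feature_time))[1]

# Poses computed from sensed motion across one rolling-shutter readout (times in seconds).
poses = [(0.000, (0.00, 0.00)), (0.010, (0.02, 0.00)), (0.020, (0.04, 0.01))]

# Feature points captured at different scanline times pick different computed poses.
for t in (0.002, 0.011, 0.019):
    print(t, select_pose(t, poses))
```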
Systems and methods described herein provide for retrieving, from a storage device, first image data previously captured by a client device. The systems and methods further detect a selection of a first image processing operation and perform the first image processing operation on the first image data to generate second image data. The systems and methods further detect a selection of a second image processing operation and perform the second image processing operation on the second image data to generate third image data. The systems and methods generate a message comprising the third image data.
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
90.
AUTOMATED ADJUSTMENT OF DIGITAL CAMERA IMAGE CAPTURE PARAMETERS
A portable electronic device with image capturing capabilities automatically or semi-automatically adjusts one or more image capturing parameters based on an input attribute of user engagement with a single-action haptic input mechanism. For example, the duration for which a single-action control button carried on a frame of the device is pressed automatically determines an image stabilization mode for on-board processing of captured image data. In one example, an above-threshold press duration automatically activates a less rigorous image stabilization mode, while button release before expiry of the threshold automatically activates a more rigorous photo stabilization mode.
Methods and systems are disclosed for training a machine learning (ML) model to detect inner speech. The system collects, by an electromyograph (EMG) communication device used by a user, a first set of EMG signals over a first time interval. The system generates a first plurality of features based on the first set of EMG signals and generates a first probability associated with presence of inner speech by processing the first plurality of features with a machine learning (ML) model. The system compares the first probability generated by the ML model to a specified threshold and detects presence of the inner speech of the user in response to determining that the first probability generated by the ML model transgresses the specified threshold.
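A schematic sketch of the decision step only: window-level features are computed from EMG samples and a stand-in probability is compared against the threshold. The feature set, logistic weights, and threshold are invented for illustration and are not the trained model from the disclosure.

```python
import math

def extract_features(emg_window):
    """Two simple window features: mean absolute value and zero-crossing rate."""
    n = len(emg_window)
    mav = sum(abs(x) for x in emg_window) / n
    zc = sum(1 for a, b in zip(emg_window, emg_window[1:]) if a * b < 0) / (n - 1)
    return [mav, zc]

def inner_speech_probability(features):
    """Stand-in logistic model; weights are illustrative, not trained values."""
    w, b = [8.0, 2.0], -1.5
    z = sum(wi * fi for wi, fi in zip(w, features)) + b
    return 1.0 / (1.0 + math.exp(-z))

THRESHOLD = 0.7

window = [0.05, -0.20, 0.31, -0.28, 0.22, -0.15, 0.09, -0.04]   # hypothetical EMG samples
p = inner_speech_probability(extract_features(window))
print(p > THRESHOLD, round(p, 3))   # inner speech detected when the probability transgresses the threshold
```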
Systems, devices, media, and methods are presented for using a flexible electronic device to selectively interact with an eyewear device. A portable eyewear device includes a processor, a memory, and a display projected onto at least one lens assembly. A flexible electronic device includes an integrated circuit, a plurality of input sensors, and a power system, all mounted on a flexible substrate that is sized and shaped to conform to a graspable object such as a ring. The flexible electronic device operates according to a power budget, operating on a sensor power budget until it detects a first interaction with at least one of the input sensors. If the first interaction exceeds a sensitivity threshold, the flexible electronic device sends a wake signal to a nearby eyewear device. In response to the wake signal, the eyewear device presents a graphical user interface (GUI) on the display. The eyewear device further presents a cursor along a path on the display that is substantially correlated to the course traveled by the flexible electronic device in motion.
Example systems, devices, media, and methods are described for presenting an interactive game in augmented reality on the display of a smart eyewear device. A hand tracking utility detects and tracks the location of hand gestures in real time, based on high-definition video data. The detected hand gestures are compared to a library of hand gestures and landmarks. Examples include synchronized, multi-player games in which each device detects and shares hand gestures with other devices for evaluation and scoring. A single-player example includes gesture-shaped icons presented on a virtual scroll that appears to move toward an apparent collision with corresponding key images, awarding points if the player's hand is located near the apparent collision and the detected hand shape matches the moving icon.
A63F 13/428 - Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle involving motion or position input signals, e.g. signals representing the rotation of an input controller or a player's arm motions sensed by accelerometers or gyroscopes
A63F 13/213 - Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
A63F 13/26 - Output arrangements for video game devices having at least one additional display device, e.g. on the game controller or outside a game booth
A63F 13/537 - Controlling the output signals based on the game progress involving additional visual information provided to the game scene, e.g. by overlay to simulate a head-up display [HUD] or displaying a laser sight in a shooting game using indicators, e.g. showing the condition of a game character on screen
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06F 3/042 - Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means by opto-electronic means
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
G06T 11/60 - Editing figures and textCombining figures or text
94.
SIGN LANGUAGE INTERPRETATION WITH COLLABORATIVE AGENTS
A method for recognizing sign language using collaborative augmented reality devices is described. In one aspect, a method includes accessing a first image generated by a first augmented reality device and a second image generated by a second augmented reality device, the first image and the second image depicting a hand gesture of a user of the first augmented reality device, synchronizing the first augmented reality device with the second augmented reality device, in response to the synchronizing, distributing one or more processes of a sign language recognition system between the first and second augmented reality devices, collecting results from the one or more processes from the first and second augmented reality devices, and displaying, in near real-time in a first display of the first augmented reality device, text indicating a sign language translation of the hand gesture based on the results.
G06F 40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06F 3/0346 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 20/20 - Scenes; Scene-specific elements in augmented reality scenes
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
G06V 40/20 - Movements or behaviour, e.g. gesture recognition
95.
GENERATING THREE-DIMENSIONAL OBJECT MODELS FROM TWO-DIMENSIONAL IMAGES
This specification discloses methods and systems for generating three-dimensional models of deformable objects from two-dimensional images. According to one aspect of this disclosure, there is described a computer-implemented method for generating a three-dimensional model of a deformable object from a two-dimensional image. The method comprises: receiving, as input to an embedding neural network, the two-dimensional image, wherein the two-dimensional image comprises an image of an object; generating, using the embedding neural network, an embedded representation of the two-dimensional image; inputting the embedded representation into a learned decoder model; and generating, using the learned decoder model, parameters of the three-dimensional model of the object from the embedded representation.
Methods and systems are disclosed for generating AR experiences on a messaging platform. The methods and systems receive, from a client device, a request to access an augmented reality (AR) experience and access a list of event types associated with the AR experience used to generate one or more metrics. The methods and systems determine that an interaction associated with the AR experience corresponds to a first event type of the list of event types and generate interaction data for the first event type representing the interaction. In response to receiving a request to terminate the AR experience, the systems and methods transmit the interaction data to a remote server.
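A small sketch of the client-side bookkeeping this implies: only interactions whose type appears in the experience's event-type list are recorded, and the accumulated records are handed off when the experience terminates. Class and field names are illustrative.

```python
import time

class InteractionTracker:
    """Records interactions for an AR experience, limited to its configured event types."""

    def __init__(self, experience_id: str, event_types: set):
        self.experience_id = experience_id
        self.event_types = event_types
        self.records = []

    def on_interaction(self, event_type: str, detail: str) -> None:
        if event_type in self.event_types:            # ignore events that are not tracked
            self.records.append({"experience": self.experience_id,
                                 "event_type": event_type,
                                 "detail": detail,
                                 "timestamp": time.time()})

    def on_terminate(self):
        """Return (and clear) the interaction data to be transmitted to the remote server."""
        out, self.records = self.records, []
        return out

tracker = InteractionTracker("ar-lens-42", {"capture", "share"})
tracker.on_interaction("capture", "photo taken with lens applied")
tracker.on_interaction("swipe", "ignored: not a tracked event type")
print(tracker.on_terminate())
```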
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing at least one program and a method for rendering virtual modifications to real-world environments depicted in image content. A reference surface is detected in a three-dimensional (3D) space captured within a camera feed produced by a camera of a computing device. An image mask is applied to the reference surface within the 3D space captured within the camera feed. A visual effect is applied to the image mask corresponding to the reference surface in the 3D space. The application of the visual effect to the image mask causes a modified surface to be rendered in presenting the camera feed on a display of the computing device.
An avatar notification system is disclosed, which performs operations that include: causing display of a notification at a client device associated with a first user account, the notification including an identification of a second user account; receiving an input that selects the notification from the client device; presenting a composition interface at the client device in response to the input that selects the notification, the composition interface including a display of a media element that comprises a first identifier associated with the first user account and a second identifier associated with the second user account; receiving a selection of the media element from the client device; and generating a message that includes the media element in response to the selection.
G06F 3/04817 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
H04L 51/216 - Handling conversation history, e.g. grouping of messages in sessions or threads
H04L 51/224 - Monitoring or handling of messages providing notification on incoming messages, e.g. pushed notifications of received messages
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
Aspects of the present disclosure involve a system comprising a computer-readable storage medium storing a program and method for providing a draggable shutter button during video recording. The program and method provide for displaying a user interface within an application running on a device, the user interface presenting real-time image data captured by a camera of the device, the user interface including a shutter button which is configured to be selectable by a user to initiate video recording in response to a first user gesture; and upon detecting the first user gesture selecting the shutter button, initiating video recording with respect to the real-time image data, and providing for the shutter button to be draggable in predefined directions to perform respective functions related to the video recording.
Systems and methods for text and audio-based real-time face reenactment are provided. An example method includes receiving a target video that includes a target face, receiving a source video that includes a source face, determining, based on a parametric face model, facial expression parameters of the source face, modifying, in real time, the target face to imitate a facial expression of the source face based on the facial expression parameters to generate a sequence of modified video frames, and displaying at least part of the sequence of modified video frames on a computing device during the generation of at least one frame of the sequence of modified video frames.