The present disclosure relates to the technical field of artificial intelligence such as intelligent transportation and computer vision, and provides a method and apparatus for detecting failure to yield to pedestrians of a vehicle, an electronic device, and a readable storage medium. The method for detecting failure to yield to pedestrians of a vehicle comprises: acquiring a marked crosswalk area on the basis of surveillance images in a surveillance video stream; for each surveillance image frame in the surveillance video stream, acquiring the intersection over union between a target vehicle and the marked crosswalk area in the surveillance image frame, and selecting multiple surveillance image frames having intersection over union greater than a preset intersection over union threshold as target surveillance images corresponding to the target vehicle; and sequentially performing human detection on each target surveillance image frame in descending order of intersection over union, and when it is determined that a human detection result corresponding to a current target surveillance image satisfies a preset requirement, determining that the target vehicle is a regulation-violating vehicle which does not yield to pedestrians. The present disclosure can expand the detection scenario, reduce the computing resources required for detection, and improve the detection efficiency while detecting whether vehicles yield to pedestrians.
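For illustration only, the following Python sketch shows one way the frame-selection and human-detection flow described above could be organized. It assumes the target vehicle and the marked crosswalk area are both represented as axis-aligned boxes, and the callables vehicle_box_of and detect_humans are hypothetical stand-ins, not part of the disclosure.

```python
# Illustrative sketch only (not the patented implementation): it assumes both the
# vehicle and the marked crosswalk are axis-aligned boxes (x1, y1, x2, y2) and
# that a separate human detector is available as a callable.

def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

def detect_failure_to_yield(frames, crosswalk_box, vehicle_box_of, detect_humans,
                            iou_threshold=0.3):
    """frames: surveillance image frames; vehicle_box_of(frame) -> box or None;
    detect_humans(frame) -> True if the human-detection result meets the preset
    requirement (e.g. a pedestrian inside the crosswalk)."""
    # Keep frames whose vehicle/crosswalk IoU exceeds the threshold.
    scored = []
    for frame in frames:
        box = vehicle_box_of(frame)
        if box is None:
            continue
        score = iou(box, crosswalk_box)
        if score > iou_threshold:
            scored.append((score, frame))
    # Run human detection in descending IoU order and stop at the first hit.
    for _, frame in sorted(scored, key=lambda s: s[0], reverse=True):
        if detect_humans(frame):
            return True  # target vehicle judged as failing to yield
    return False
```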
In the field of cloud networks and network security, which may be applied to intelligent cloud scenarios, a cloud network message processing method includes: obtaining a cloud network message; determining, from at least one type of pre-configured candidate security device, a target type of candidate security device corresponding to the cloud network message; in the case that there are multiple candidate security devices of the target type, determining a target security device from the multiple candidate security devices of the target type based on session information included in the cloud network message, where cloud network messages with the same session information correspond to the same target security device; and sending the cloud network message to the target security device for security processing, and sending the cloud network message that has been security-processed by the target security device to a destination.
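The session-consistent device selection described above can be pictured with a simple hash-based sketch; the 5-tuple session representation and the hashing scheme below are assumptions made for illustration, not the disclosed gateway logic.

```python
# Minimal sketch, assuming session information is a 5-tuple
# (src_ip, src_port, dst_ip, dst_port, protocol); a stable hash ensures that
# messages of the same session always land on the same security device.
import hashlib

def pick_target_security_device(session_info, candidate_devices):
    """candidate_devices: list of device identifiers of the target type."""
    key = "|".join(str(field) for field in session_info).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidate_devices)
    return candidate_devices[index]

# Usage: the same session always maps to the same firewall instance.
devices = ["fw-0", "fw-1", "fw-2"]
session = ("10.0.0.5", 34567, "10.0.1.9", 443, "tcp")
assert pick_target_security_device(session, devices) == \
       pick_target_security_device(session, devices)
```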
The present disclosure relates to the technical fields of artificial intelligence, such as cloud services, big data and large language models, and provides an algorithm service deployment method and apparatus, an electronic device, and a readable storage medium. The algorithm service deployment method comprises: a first server acquiring service operation data and service information of a target algorithm service, wherein the service operation data comprises one of an application program interface, a mirror image file and a model file; determining a service access type on the basis of the service property of the target algorithm service, and acquiring a target access method corresponding to the service access type, wherein the service access type comprises one of application program interface access, mirror image access and model access; and deploying the target algorithm service at the first server on the basis of the target access method, the service operation data and the service information. According to the present disclosure, the first server supports deployment of algorithm services corresponding to different service access types, so that the first server can have stronger service deployment performance.
A motion estimation method, apparatus, electronic device, storage medium, and computer program product are disclosed, which relates to the field of artificial intelligence, specifically cloud storage, cloud computing, video encoding. A method for motion estimation comprises: determining candidate search spaces and candidate search starting points based on a lookahead motion vector and a predicted search starting point of a current block; determining a target search starting point from the candidate search starting points and determining a target search space from the candidate search spaces; performing a search based on the target search starting point and the target search space to obtain an initial motion estimation result for the current block; obtaining a target motion estimation result for the current block based on the initial motion estimation result.
The present disclosure relates to the technical field of artificial intelligence, in particular to the technical fields of computer vision, deep learning and the like, and provides a target identification method and device. The specific implementation comprises: obtaining a video stream collected by a rotatable camera device; determining a preset position of the camera device on the basis of a target in an image frame corresponding to the video stream; determining a still image frame and a fixed still object in the still image frame on the basis of the preset position and the video stream; and determining a position change result of the target in the video stream on the basis of the still image frame and the fixed still object. The implementation improves the accuracy of position change identification of targets.
Provided are a query processing method based on a large language model, an electronic device, and a storage medium. The query processing method based on a large language model includes acquiring a to-be-processed target query; generating a prompt based on a to-be-used target data model, target format information of a specified data format, and the target query; inputting the prompt into the large language model to obtain a target parsing result of the specified data format outputted by the large language model; and modifying the target parsing result based on the target data model.
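As an illustration of how such a prompt might be assembled and how a parsing result might be modified against the data model, consider the hypothetical sketch below; the prompt wording, the data-model layout, and the repair rule are all assumptions rather than the disclosed method.

```python
# Hypothetical prompt-construction sketch; the actual prompt text, data-model
# representation, and "modify" step are not specified in the abstract.
import json

def build_prompt(target_query, data_model, format_info):
    """data_model: dict with a 'fields' list plus any descriptive keys;
    format_info: description of the specified output format (e.g. a JSON schema)."""
    return (
        "You are a query parser.\n"
        f"Data model:\n{json.dumps(data_model, indent=2)}\n"
        f"Output format:\n{format_info}\n"
        f"Query: {target_query}\n"
        "Return the parsing result strictly in the specified format."
    )

def repair_parsing_result(result, data_model):
    """Drop fields that do not exist in the target data model - one possible way
    to 'modify the target parsing result based on the target data model'."""
    allowed = set(data_model["fields"])
    return {k: v for k, v in result.items() if k in allowed}
```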
A method of training a deep learning model and a method of synthesizing a speech are provided, which relate to a field of artificial intelligence technology, in particular to fields of large model, large language model, generative model, deep learning, and speech processing technologies. The method of training a deep learning model includes: determining a reference speech feature of a sample speech, the reference speech feature being associated with a prosodic feature of the sample speech; retrieving a speech library using a sample text corresponding to the sample speech, so as to obtain a pronunciation expression feature of the sample text; inputting the pronunciation expression feature into the deep learning model to obtain an output speech feature; determining a loss of the deep learning model according to the reference speech feature and the output speech feature; and adjusting a parameter of the deep learning model according to the loss.
The present disclosure relates to the technical field of artificial intelligence, and specifically to technical fields such as computer vision, deep learning and big data. Provided are a rainfall identification method and apparatus, a model training method and apparatus, and a device and a storage medium, which may be applied in scenarios such as smart cities and emergency management. The rainfall identification method comprises: processing a target video collected by a target camera, so as to obtain an initial rainfall identification result; determining a target rainfall amount station, the distance between which and the target camera meets a preset condition, and acquiring target rainfall amount data of the target rainfall amount station; and on the basis of the initial rainfall identification result and the target rainfall amount data, determining a target rainfall identification result. The present disclosure may improve the accuracy of rainfall identification.
Provided are a query processing method based on a large language model, a prompt construction method, an electronic device, and a storage medium. The query processing method includes acquiring a to-be-processed target query; acquiring a data field in a target data model and acquiring target format information of a specified data format; constructing a prompt based on the data field in the target data model, the target format information, and the target query; and inputting the prompt into the large language model to obtain a target format result outputted by the large language model.
A method for predicting a structure of a protein complex includes: obtaining an initial coordinate of each amino acid residue in a target protein complex, and obtaining a target residue pair feature, a first multiple sequence alignment (MSA) feature and a second MSA feature of each protein monomer in the target protein complex; and inputting the initial coordinate of each amino acid residue, and the target residue pair feature, the first MSA feature and the second MSA feature of each protein monomer into an N-level fold iteration network layer, and obtaining a target coordinate of each amino acid residue by predicting a torsion angle, a position transformation at residue level and a position transformation at monomer chain level of each amino acid residue via the N level fold iteration network layer, to obtain a predicted structure of the protein complex.
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
11.
HUMAN-COMPUTER INTERACTION METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Provided are a human-computer interaction method and apparatus, an electronic device and a storage medium, relating to the technical field of artificial intelligence, in particular to the technical fields of deep learning, natural language processing and large models. The specific implementation solution comprises: in response to a human-computer interaction request and on the basis of a first dialogue text comprised in the human-computer interaction request, determining from among a plurality of plug-ins registered in a large language model a first target plug-in related to the first dialogue text; obtaining a second dialogue text on the basis of the first dialogue text and a description text of the first target plug-in; and inputting the second dialogue text into the large language model to obtain a reply text.
A method for reference frame selection, an apparatus for reference frame selection, an electronic device and a storage medium are provided, which relate to the field of data processing technology, in particular to the fields of video coding technology and unsupervised learning technology. The method includes: acquiring a current frame to be processed and determining attribute information of the current frame; selecting candidate reference frames from a reference frame set according to the attribute information; clustering the selected candidate reference frames to obtain at least one cluster; and selecting one candidate reference frame from each of the at least one cluster and adding the selected candidate reference frame from each cluster to a reference frame list associated with the current frame. The technical solution provided herein can improve the accuracy of reference frame selection.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
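For the clustering-based reference frame selection above, the following sketch illustrates the idea under simplifying assumptions: each candidate frame is summarized by a feature vector, a basic k-means groups the candidates, and the frame nearest each cluster centre joins the reference list. The feature choice and the clustering method are not fixed by the abstract.

```python
# Illustrative sketch under assumptions the abstract does not fix: each candidate
# reference frame is summarized by a feature vector (e.g. mean luma, temporal
# distance) and clustered with a simple k-means.
import numpy as np

def select_reference_frames(candidate_features, candidate_ids, num_clusters=3,
                            iterations=10, seed=0):
    rng = np.random.default_rng(seed)
    feats = np.asarray(candidate_features, dtype=float)
    k = min(num_clusters, len(feats))
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iterations):
        # Assign each candidate to its nearest centre, then recompute centres.
        dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = feats[labels == c].mean(axis=0)
    reference_list = []
    for c in range(k):
        members = np.where(labels == c)[0]
        if members.size == 0:
            continue
        # The candidate closest to the cluster centre represents that cluster.
        best = members[np.argmin(np.linalg.norm(feats[members] - centers[c], axis=1))]
        reference_list.append(candidate_ids[best])
    return reference_list
```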
13.
MODEL OPERATOR PROCESSING METHOD AND DEVICE, ELECTRONIC EQUIPMENT AND STORAGE MEDIUM
A method for processing a model operator includes: determining an operator set for model networking, wherein the operator set comprises a plurality of operators; determining a storage amount occupied by an output tensor of each operator in the operator set and a computation time period consumed in a forward computation of each operator in the operator set; and determining a first operator participating in recomputation in a model from the operator set, based on the storage amounts and the computation time periods of the plurality of operators.
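One plausible way to turn the storage amounts and computation time periods into a recomputation choice is sketched below; the greedy storage-saved-per-second rule and the storage budget are assumptions for illustration, not the disclosed selection criterion.

```python
# A rough sketch of one plausible selection rule; the abstract only says the
# choice is based on storage amounts and computation time periods.
from dataclasses import dataclass

@dataclass
class Operator:
    name: str
    output_bytes: int       # storage occupied by the output tensor
    forward_seconds: float  # time consumed in the forward computation

def choose_recompute_operators(operator_set, storage_budget_bytes):
    """Greedily mark operators for recomputation until the storage saved by
    discarding their output activations reaches the budget."""
    ranked = sorted(operator_set,
                    key=lambda op: op.output_bytes / max(op.forward_seconds, 1e-9),
                    reverse=True)
    chosen, saved = [], 0
    for op in ranked:
        if saved >= storage_budget_bytes:
            break
        chosen.append(op)
        saved += op.output_bytes
    return chosen
```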
A method is provided that includes: obtaining first urban data of a first sample urban region; inputting the first urban data into a multi-modal foundation model to obtain respective predicted vector representations of a plurality of first data segments; obtaining a plurality of general-purpose foundation models that are pre-trained; for each general-purpose foundation model: generating a vector representation label of a first data segment of a corresponding data modality by using the general-purpose foundation model; and determining a knowledge distillation loss of the general-purpose foundation model based on the vector representation label and a predicted vector representation of the first data segment; and adjusting parameters of the multi-modal foundation model based on at least respective knowledge distillation losses of the plurality of general-purpose foundation models.
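The per-teacher distillation losses described above can be illustrated with the following sketch; the mean-squared-error loss form and the callable teacher interface are assumptions, and the actual training loop is omitted.

```python
# Minimal distillation-loss sketch; teachers are stand-in callables that map a
# data segment of their modality to a vector-representation label.
import numpy as np

def knowledge_distillation_loss(pred_vec, label_vec):
    """Mean-squared error between the student prediction and the teacher label."""
    pred, label = np.asarray(pred_vec), np.asarray(label_vec)
    return float(np.mean((pred - label) ** 2))

def total_distillation_loss(predicted_by_modality, teachers, segments_by_modality):
    """predicted_by_modality: modality -> predicted vector from the multi-modal model;
    teachers: modality -> pre-trained general-purpose foundation model (callable);
    segments_by_modality: modality -> first data segment of that modality."""
    losses = {}
    for modality, teacher in teachers.items():
        label = teacher(segments_by_modality[modality])
        losses[modality] = knowledge_distillation_loss(
            predicted_by_modality[modality], label)
    # Parameters of the multi-modal foundation model would then be adjusted
    # based on (at least) the sum of these per-teacher losses.
    return sum(losses.values()), losses
```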
The present disclosure relates to the technical field of computers, and in particular to the technical fields of artificial intelligence (AI), neural network models, and smart city, and provides a target detection method based on a multi-task AI large model, and a model training method based on a multi-task AI large model. The specific implementation scheme of the target detection method comprises: recognizing a target object in an image under test to obtain a first recognition result; on the basis of the confidence level of the first recognition result and a first threshold corresponding to a first precision, determining a first alarm object from the first recognition result as a detection result; when a trigger condition is met, performing target detection on an image under supplementary test corresponding to the first recognition result to obtain a second recognition result; on the basis of the confidence level of the second recognition result and a second threshold corresponding to a second precision, determining a second alarm object from the second recognition result; and updating the detection result on the basis of the second alarm object. The present disclosure can ensure high precision of target detection while reducing missed recalls.
The present application relates to the technical field of computers, and particularly to the field of source code. Provided are a method and apparatus for generating simulated data, an electronic device, and a storage medium. The specific implementation solution comprises: acquiring a database table of a project to be processed, and determining data simulation configuration information corresponding to a field in the database table, wherein the database table comprises at least one field, the field represents the type of simulated data required by said project, and the data simulation configuration information represents a generation means for the simulated data and the format of the simulated data; and on the basis of the data simulation configuration information, generating the simulated data under the field in the database table. Different pieces of configuration information are set for different fields, and when data need to be simulated, the corresponding configuration information is looked up to automatically generate the simulated data, thereby improving the efficiency of data simulation.
Disclosed are a sentiment analysis method and apparatus, a large language model training method and apparatus, an electronic device, a storage medium, a computer program product and a computer program. The sentiment analysis method comprises: acquiring first target text; extracting from the first target text an object to be analyzed; generating second target text on the basis of the first target text and said object, wherein the second target text comprises task prompt text, and the task prompt text is used for prompting a large language model to execute a sentiment analysis task on said object on the basis of the first target text; and inputting the second target text into the large language model to obtain the sentiment polarity of said object.
The present application provides a large language model-based event processing method and apparatus, a device and a medium. The large language model-based event processing method comprises: acquiring a question input by a user; acquiring pre-generated event information; fusing the question and the event information to obtain a fusion result; and inputting the fusion result into a pre-trained large language model to obtain reply content corresponding to the question.
The present disclosure relates to the technical field of artificial intelligence, and specifically to the technical fields such as large models and natural language understanding, and provides a chart generation method and apparatus, a device, and a storage medium. The chart generation method comprises: acquiring target text content and target prompt information; on the basis of the target text content and the target prompt information, using a first pre-trained language model to generate structured information, wherein the structured information is used for generating a target chart; on the basis of the structured information, using a second pre-trained language model to generate the target chart; and displaying the target chart. The present disclosure can improve the chart generation efficiency and accuracy.
The present disclosure relates to the technical field of video processing, in particular to the technical field of monitoring video processing, and provides a monitoring video processing method, a monitoring video processing apparatus, an electronic device, a storage medium, and a program product. The specific implementation solution is: acquiring a monitoring video stream to be processed; performing semantic segmentation on video frames in said monitoring video stream to obtain semantic tags of the video frames; on the basis of the semantic tags of the video frames and scenario determination rules, determining service scenarios to which the video frames are applicable; and determining a scenario tag of said monitoring video stream on the basis of the service scenarios to which the plurality of video frames in said monitoring video stream are applicable.
The present disclosure relates to the technical field of code generation, and provides a code generation method and apparatus, a device and a storage medium. The method comprises: in response to receiving operation information of a user, requesting a node backend to create a corresponding project and a project file; in response to determining that the project and the project file are created, requesting a java backend to create a data source; calling the java backend to perform project engineering initialization and code assembly; and in response to determining that the code assembly is completed, performing project verification and code export.
Disclosed are a student model generation method and apparatus based on a large model, and an electronic device, a storage medium, a computer program product and a computer program. The method comprises: acquiring a sample data set; inputting input data and prompt information into a large model, so as to acquire first content generated by the large model; converting the first content into a first prediction result which has the same type as a labeling result; inputting the input data into an initial student model, so as to acquire a second prediction result that is output by the initial student model; determining a correction gradient on the basis of respective differences between the second prediction result and the first prediction result and between the second prediction result and the labeling result; and on the basis of the correction gradient, correcting the initial student model, so as to acquire a target student model.
The present disclosure relates to the technical field of artificial intelligence, and specifically relates to the fields of deep learning and computer vision. Provided are a multi-objective optimization method and apparatus, and a device and a storage medium. The method comprises: acquiring a set of values to be quantized and a set of objectives to be optimized; for each value to be quantized in said set of values, acquiring a quantization coefficient corresponding to said value, and determining a set of adjacent quantization coefficients of the quantization coefficient; on the basis of a set of reconstruction values corresponding to the set of adjacent quantization coefficients, determining a set of reconstruction distortion values; and on the basis of the set of reconstruction distortion values and said set of objectives, determining a target quantization coefficient. The multi-objective optimization method provided in the present disclosure improves the performance of other optimization objectives while ensuring the performance of traditional optimization objectives.
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
The present disclosure provides a data operation method, apparatus, device, and storage medium, which relate to the technical field of distributed file systems, and in particular to the technical fields of multi-version concurrency control and log-structured merge trees. The specific implementation scheme is as follows: obtaining a plurality of operation records on at least one piece of data in a file system; determining a target operation record of target data in the plurality of operation records, where the target operation record is a deletion record, and the target operation record is the latest operation record of the target data; and deleting at least one version of the target data in the file system according to the target operation record.
G06F 16/16 - File or folder operations, e.g. details of user interfaces specifically adapted to file systems
G06F 16/11 - File system administration, e.g. details of archiving or snapshots
25.
METHOD AND APPARATUS FOR GENERATING IMAGE ADVERSARIAL SAMPLE, METHOD AND APPARATUS FOR TRAINING IMAGE PROCESSING MODEL, IMAGE PROCESSING METHOD AND APPARATUS, AND DEVICE AND MEDIUM
Provided in the present application are a method and apparatus for generating an image adversarial sample, a method and apparatus for training an image processing model, an image processing method and apparatus, and a device and a medium. The method for generating an image adversarial sample comprises: acquiring an original image sample, and acquiring a feature vector map corresponding to the original image sample (S110); performing image scaling processing on the feature vector map according to an image scale of the original image sample, so as to obtain a standard-scale feature map (S120); using an attention mechanism network of a target type to process the standard-scale feature map, so as to obtain an attention influence map (S130); and on the basis of the attention influence map, adding a disturbance to the original image sample, so as to obtain an image adversarial sample corresponding to the original image sample (S140).
The present disclosure provides a method and apparatus for generating a 3D scene based on a large language model (LLM), an electronic device, and a storage medium, which relate to the field of artificial intelligence technologies, particularly the fields of three-dimensional modeling technologies, large language model technologies, or the like. The three-dimensional scene generation method based on a large language model includes: processing description information of a target three-dimensional scene to obtain label information in the description information; generating a query operation prompt for the LLM based on the label information, and acquiring, by the LLM, a target asset set matched with the label information based on the query operation prompt, the target asset set including a target asset in the target three-dimensional scene, target material information of the target asset, and target scene attribute information of the target asset; and generating the target three-dimensional scene based on the target asset set.
The present disclosure provides a method and apparatus for transferring a facial expression of a digital human, an electronic device, and a storage medium, which relate to the fields of augmented reality technologies, virtual reality technologies, computer vision technologies, deep learning technologies, or the like, and can be applied to scenarios such as the metaverse, a virtual digital human, or the like. An implementation includes: screening an identification of a target reference model matched with an object model from a preset reference model library, the reference model library including a plurality of reference models; acquiring an expression library of the target reference model based on the identification of the target reference model; and transferring a last frame of an expression in the expression library of the target reference model into the object model to obtain a last frame of an expression of the object model.
Provided is a method for processing video coding. The method includes: according to neighboring image blocks of a target image block in a video frame, determining whether the target image block belongs to a candidate caption region; in response to determining that the target image block belongs to the candidate caption region, generating a pixel histogram of the target image block; according to the pixel histogram of the target image block, determining a region type to which the target image block belongs, where the region type is a caption region or a non-caption region; and according to the region type to which the target image block belongs, determining a target coding mode for the target image block.
H04N 19/132 - Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
H04N 19/136 - Incoming video signal characteristics or properties
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
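The histogram-based region-type decision above can be illustrated as follows; the concentration heuristic (caption blocks tend to concentrate their pixels in a few dominant bins) and all thresholds below are assumptions for the sketch, not the disclosed criterion.

```python
# Illustrative only: the abstract does not give the histogram criterion, so this
# sketch uses one common heuristic - flat background plus high-contrast text
# produces a histogram dominated by a few bins.
import numpy as np

def classify_block_by_histogram(block, num_bins=32, top_k=4, concentration=0.7):
    """block: 2-D array of 8-bit luma values for one image block.
    Returns 'caption' or 'non-caption'."""
    hist, _ = np.histogram(block, bins=num_bins, range=(0, 256))
    hist = hist / max(hist.sum(), 1)
    top_mass = np.sort(hist)[::-1][:top_k].sum()
    return "caption" if top_mass >= concentration else "non-caption"

# A caption-like block (two dominant values) versus a noisy natural block.
caption_block = np.array([[16, 16, 235, 235], [16, 235, 235, 16]])
natural_block = np.random.default_rng(0).integers(0, 256, size=(8, 8))
print(classify_block_by_histogram(caption_block))   # caption
print(classify_block_by_histogram(natural_block))   # likely non-caption
```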
29.
METHOD AND APPARATUS FOR IMAGE PROCESSING, ELECTRONIC DEVICE AND STORAGE MEDIUM
A method for image processing, including: obtaining an image to be processed; determining a portrait area in the image to be processed, and cropping, based on the portrait area, a target background image from the image to be processed; obtaining a portrait by performing portrait matting on the image to be processed and obtaining an enlarged portrait by enlarging the portrait, wherein a height of the enlarged portrait is greater than a height of the target background image; and generating, based on the target background image and the enlarged portrait, a target image.
Provided is a data query optimization method, an electronic device and a storage medium, relating to the field of data processing technology and in particular to the technical fields of distributed database, big data, cloud computing and others. The method includes: determining a plurality of candidate execution plans for a target query request; determining execution costs of the plurality of candidate execution plans; updating the execution costs of the plurality of candidate execution plans based on monitoring data of data nodes involved in the plurality of candidate execution plans, to obtain final costs of the plurality of candidate execution plans; and screening out a final execution plan for the target query request from the plurality of candidate execution plans based on the final costs of the plurality of candidate execution plans.
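A simplified version of the monitoring-based cost update is sketched below; scaling each plan's base cost by the worst load among the data nodes it touches is only one possible rule and is not taken from the disclosure.

```python
# Simplified cost-adjustment sketch, assuming monitoring data is summarized as a
# per-node load in [0, 1].
def finalize_plan_costs(candidate_plans, node_load):
    """candidate_plans: list of dicts {'name', 'base_cost', 'nodes'};
    node_load: node id -> current load observed by monitoring."""
    final = {}
    for plan in candidate_plans:
        worst_load = max(node_load.get(n, 0.0) for n in plan["nodes"])
        final[plan["name"]] = plan["base_cost"] * (1.0 + worst_load)
    return final

def pick_execution_plan(candidate_plans, node_load):
    costs = finalize_plan_costs(candidate_plans, node_load)
    return min(costs, key=costs.get)

plans = [{"name": "hash_join", "base_cost": 100, "nodes": ["dn1", "dn2"]},
         {"name": "index_scan", "base_cost": 120, "nodes": ["dn3"]}]
print(pick_execution_plan(plans, {"dn1": 0.9, "dn2": 0.2, "dn3": 0.1}))  # index_scan
```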
Provided is a method for processing an oracle region cache, an electronic device and a storage medium, relating to the field of data processing technology, and in particular to the fields of big data, cloud computing, distributed database, intelligent search and other technologies. The method includes: obtaining a benefit parameter of a region to be processed, wherein the benefit parameter is used to represent a difference between benefit and cost of setting the region to be processed in the oracle region cache; and selecting the region to be processed to update the oracle region cache when the benefit parameter of the region to be processed meets a target condition.
A digital human generation method, an electronic device and a storage medium are disclosed. The solution relates to the fields of augmented reality technologies, virtual reality technologies, computer vision technologies, deep learning technologies, or the like, and can be applied to scenarios, such as metaverse, a virtual digital human, or the like. An implementation includes: acquiring a corresponding target object model based on a picture of a to-be-generated digital human; acquiring a corresponding point cloud of a head key feature in the picture from a pre-configured feature library based on the head key feature; and fusing the point cloud of the head key feature in the target object model to obtain a digital human figure.
The disclosure provides a code completion method based on a big model. The method includes: determining a first code element where a position to be completed is located in a first code file to be completed; determining a second code file having a dependency relationship with the first code file from a development project to which the first code file belongs; determining, according to the first code element, a second code element whose correlation with the first code element meets a preset condition, in which the second code element belongs to at least one of the first code file or the second code file; and generating a target code corresponding to the position to be completed through a big model based on a signature of the second code element.
The present disclosure relates to the technical field of artificial intelligence, in particular to the technical fields such as intelligent office, cloud computing, generative dialogue systems, and large language models (LLMs), and provides an LLM-based data query method and apparatus, a device, and a storage medium. The LLM-based data query method comprises: determining a target data table from among candidate data tables on the basis of a query question, wherein the target data table comprises candidate attributes; determining a target attribute from among the candidate attributes on the basis of the query question; generating query instruction prompt information of an LLM on the basis of the query question, table information of the target data table, and attribute information of the target attribute, and using the LLM to generate a query instruction on the basis of the query instruction prompt information; and on the basis of the query instruction, querying from within the target data table to obtain a query answer corresponding to the query question. The present disclosure can improve the data query efficiency and accuracy.
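The table selection, attribute selection, and prompt assembly steps above might look roughly like the following sketch; the keyword-overlap scoring and the prompt template are illustrative assumptions and not the disclosed implementation.

```python
# Schematic sketch: candidate tables and attributes are ranked by simple keyword
# overlap with the query question, then a query-instruction prompt is assembled.
def score_overlap(question, text):
    q_tokens = set(question.lower().split())
    return len(q_tokens & set(text.lower().split()))

def build_query_instruction_prompt(question, candidate_tables):
    """candidate_tables: list of dicts {'name', 'description', 'attributes'},
    where each attribute is a dict {'name', 'description'}."""
    # Step 1: pick the candidate table most related to the query question.
    table = max(candidate_tables,
                key=lambda t: score_overlap(question, t["name"] + " " + t["description"]))
    # Step 2: pick the attributes most related to the query question.
    attrs = [a for a in table["attributes"]
             if score_overlap(question, a["name"] + " " + a["description"]) > 0]
    attrs = attrs or table["attributes"]
    # Step 3: assemble the query-instruction prompt for the LLM.
    attr_lines = "\n".join(f"- {a['name']}: {a['description']}" for a in attrs)
    prompt = (
        f"Table: {table['name']} ({table['description']})\n"
        f"Relevant attributes:\n{attr_lines}\n"
        f"Question: {question}\n"
        "Generate a query instruction (e.g. SQL) over this table that answers the question."
    )
    return table, attrs, prompt
```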
A method and apparatus for processing an access request, and a computer readable storage medium are provided. The method includes acquiring identification information and an IP address of an access account from an authentication message; determining permission configuration information matching the identification information; generating an access control entry based on the permission configuration information and the IP address; and processing an access request of an access account based on an access control entry.
Provided is a method of deploying a multimodal large model, an electronic device and a storage medium, relating to the field of artificial intelligence technology, and in particular, to the fields of deep learning and model deployment. The method includes: splitting a first multimodal large model into a visual part and a linguistic part; determining a first static graph model corresponding to the visual part and a second static graph model corresponding to the linguistic part; and deploying the first multimodal large model based on the first static graph model and the second static graph model.
A method for obtaining a cover image includes: obtaining a plurality of first cropped images of an original image corresponding to a candidate resource; obtaining an aesthetic score of each of the plurality of first cropped images; and determining a target cover image of the candidate resource from the plurality of first cropped images based on the aesthetic score of each first cropped image.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 30/18 - Extraction of features or characteristics of the image
G06V 30/262 - Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
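The cover-image selection above reduces to scoring each cropped candidate and keeping the best one; the sketch below assumes a cropping helper and an aesthetic scorer are available as callables, both of which are hypothetical.

```python
# Minimal sketch: the cropping strategy and the aesthetic scoring model are not
# specified in the abstract, so they are passed in as callables here.
def pick_cover_image(original_image, crop_boxes, crop_fn, aesthetic_score):
    """crop_fn(image, box) -> cropped image; aesthetic_score(image) -> float."""
    cropped = [crop_fn(original_image, box) for box in crop_boxes]
    scores = [aesthetic_score(img) for img in cropped]
    best = max(range(len(cropped)), key=scores.__getitem__)
    return cropped[best], scores[best]
```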
38.
METHOD AND APPARATUS FOR GENERATING COMMENT INFORMATION BASED ON LARGE MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM
The disclosure provides a method and an apparatus for generating comment information based on a large model, an electronic device and a storage medium, and relates to the technical field of artificial intelligence, and in particular to the technical fields of deep learning, large models, natural language processing, and the like. The specific technical solution includes: obtaining description information of a resource to be commented on by understanding, based on the large model, the resource to be commented on; obtaining, based on the description information, comment information of the resource to be commented on, in which the comment information includes at least a comment video of the resource to be commented on; and displaying the comment video in a comment section. Intelligent generation of comment videos and texts is realized, which improves the accuracy of the comment information, simplifies the comment generation process, and speeds up comment generation. Further, by introducing a video comment format, more diverse comment formats are provided for users to select from, greatly enhancing the user experience.
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application communicating with other users, e.g. chatting
G11B 27/031 - Electronic editing of digitised analogue information signals, e.g. audio or video signals
H04L 47/125 - Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
39.
TRAFFIC LIGHT PREDICTION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
A traffic light prediction method, an apparatus, and an autonomous vehicle are provided. The method includes: determining lane line information of a lane where the vehicle is located and information of a target traffic light corresponding to the lane based on current position information of the vehicle, and recording the lane line information and the information of the target traffic light as element information; recognizing an obstacle in an image acquired by the vehicle to obtain obstacle information; associating the element information with the obstacle information to generate topology information, where the topology information is used to represent a binding relationship among the target traffic light, a lane line, and an obstacle; and generating a prediction result of the target traffic light based on the element information, the obstacle information, and the topology information.
G06V 20/58 - Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/56 - Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
40.
METHOD FOR TRAINING IMAGE CROPPING MODEL, METHOD FOR PROCESSING IMAGE, ELECTRONIC DEVICE AND STORAGE MEDIUM
Provided is a method for training an image cropping model, a method for processing an image, an electronic device and a storage medium, relating to the field of deep learning and image processing technology. The training method includes: obtaining sample data, wherein the sample data at least includes: a sample image, a first cropped image obtained by cropping the sample image in a first manner, and a second cropped image obtained by cropping the sample image in a second manner; determining a target loss function; and using at least the sample data and the target loss function to perform model training on a preset image cropping model to obtain a target image cropping model.
G06V 10/32 - Normalisation of the pattern dimensions
G06V 10/42 - Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
G06V 10/74 - Image or video pattern matching; Proximity measures in feature spaces
G06V 10/766 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
G06V 10/77 - Processing image or video features in feature spaces; Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
41.
METHOD FOR INFORMATION PROCESSING, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A computer-implemented method for information processing includes: obtaining text information, in which the text information includes first text information of a resource to be commented on and second text information of a candidate prompt; selecting a target prompt from the candidate prompts based on the text information; and generating comment information of the resource to be commented on, based on the resource to be commented on and the target prompt.
A method for model training based on a large model includes: determining a first large model as a teacher model of a language model, and performing distillation learning on the language model based on the first large model; inputting a first prompt text into the language model, and obtaining a plurality of first response texts for the first prompt text output by the language model; determining a reference response text for the first prompt text from the plurality of first response texts; and training the language model based on the reference response text for the first prompt text.
A large model-based recommendation method includes: determining description information of interested content corresponding to a target user; inputting a content to be recommended, the description information of interested content and current popular search sentences into a large model to generate at least one recommendation card corresponding to the content to be recommended, in which the recommendation card contains a recommendation word associated with the content to be recommended; obtaining a current behavior characteristic of the target user; and in response to the current behavior characteristic satisfying a display condition of the recommendation card, displaying the recommendation card corresponding to at least one content to be recommended.
There is provided a method for video processing, an electronic device, and a storage medium, which relates to the technical field of image processing, specifically to technical fields such as digital video and image display, which may be used in intelligent cloud and cloud computing scenarios. A specific implementation solution involves: acquiring ambient brightness data of a display device, the display device adopting a standard dynamic range (SDR) technology; obtaining screen brightness data of the display device according to video brightness data of to-be-displayed high dynamic range (HDR) video, metadata of the HDR video, and the ambient brightness data; wherein the video brightness data is obtained by tone mapping according to the metadata; and controlling, by using the screen brightness data, the display device to display the HDR video.
G06T 5/92 - Dynamic range modification of images or parts thereof based on global image properties
G06T 5/90 - Dynamic range modification of images or parts thereof
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G09G 3/22 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix using controlled light sources
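Purely as an illustration of mapping ambient brightness and HDR metadata to a screen brightness level for the method above, consider the sketch below; the interpolation rule and the metadata fields are assumptions, since the abstract does not disclose the actual relation.

```python
# Illustrative mapping only, assuming metadata carries 'min_nits'/'max_nits' for
# the HDR content and ambient brightness is measured in lux. Brighter
# surroundings push the target screen level towards the content peak, clamped to
# what the tone-mapped video actually contains.
def screen_brightness(video_brightness_nits, metadata, ambient_lux,
                      ambient_max_lux=1000.0):
    ambient = min(max(ambient_lux / ambient_max_lux, 0.0), 1.0)
    lo, hi = metadata["min_nits"], metadata["max_nits"]
    target = lo + ambient * (hi - lo)
    return min(target, video_brightness_nits)

print(screen_brightness(600, {"min_nits": 100, "max_nits": 1000}, ambient_lux=500))
```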
45.
METHOD OF DETERMINING METEOROLOGICAL INFORMATION, ELECTRONIC DEVICE AND STORAGE MEDIUM
A method of determining meteorological information, an electronic device and a storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of deep learning and large models. The method includes performing a feature extraction on meteorological raster data of a target region within a target time period to obtain a meteorological feature vector; inputting to-be-processed meteorological data of the target region within the target time period into a large language model to obtain a text summary including a meteorological information determination manner; performing an information enhancement processing on the meteorological feature vector by using the text summary to obtain an information enhancement result; and performing a self-attention processing on the information enhancement result to obtain a meteorological information determination result output for the to-be-processed meteorological data.
A method of generating a content based on a large model, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of deep learning, natural language processing, computer vision, large models, etc. The method includes performing an intention recognition on an input information in response to receiving the input information; generating a painting knowledge text by invoking a multimodal large model based on an intention for painting knowledge acquisition in response to recognizing the intention for painting knowledge acquisition from the input information; generating a first driving voice and a first action instruction for driving a virtual character according to the painting knowledge text; and broadcasting the painting knowledge text by driving the virtual character according to the first driving voice and the first action instruction.
Provided is a performance optimization method for a model training device, an electronic device, and a storage medium, relating to the fields of deep learning, large model training, and distributed parallel strategies. The method includes: determining a communication timing of a current model training device with respect to a target model block at a target sorting position, so that collective communication with respect to model blocks at the target sorting position can be performed synchronously with the other model training devices of a plurality of model training devices; and performing the collective communication on a backward gradient of the target model block at the communication timing.
A method for processing a query-response information is provided, which relates to a field of artificial intelligence technology, and in particular to fields of deep learning, large models, intelligent query and response, etc. The method for processing a query-response information includes: generating at least one initial response information according to a query information provided by an object; acquiring at least one feedback information corresponding to the at least one initial response information, wherein the feedback information indicates a preference degree of the object for the initial response information; and generating a training sample according to the query information, the at least one initial response information and the at least one feedback information. The present disclosure further provides a method for training a conversational model, an electronic device, and a storage medium.
A method for information processing is performed by an electronic device, and the method includes: obtaining a residue sequence A_T that does not carry amino acid information and a first protein backbone structure B_T generated from pure noise; and performing iterative denoising on the residue sequence A_T and the first protein backbone structure B_T, where, for a t-th denoising step, coevolution information of a residue sequence A_(T+1−t) is obtained, and a residue sequence A_(T−t) and a first protein backbone structure B_(T−t) after the t-th denoising step are obtained based on the coevolution information and a first protein backbone structure B_(T+1−t), until the denoising is completed and a target amino acid sequence and a second protein backbone structure are obtained, where t is a positive integer, 1 ≤ t ≤ T, and T is the number of denoising steps.
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids
A data query method and apparatus based on a large model, an electronic device, and a storage medium are disclosed, which relate to the field of artificial intelligence, specifically natural language processing, deep learning, and large model technologies, applicable to scenarios such as dialogue systems and information retrieval. The method includes: performing entity recognition on a query to obtain a target entity in the query; obtaining first related content associated with the target entity from internal information, and performing data analysis on the first related content using a large language model (LLM) to obtain a data analysis result; obtaining second related content associated with the target entity from external information, and performing data generation on the second related content using the LLM to obtain a data generation result; and obtaining a query result corresponding to the query based on the data analysis result and the data generation result.
The disclosure provides a method for optimizing content generated by a large model, an apparatus for optimizing content generated by a large model, an electronic device and a storage medium, and relates to the technical field of artificial intelligence, especially to the technical fields of text processing, large language model and the like. It can be applied to official document processing, automatic contract generation, legal document writing, enterprise internal system management and so on. The method includes: obtaining a question entered by a user, wherein the question is used to instruct a generation of a text of a target type; obtaining a set of target rules corresponding to the target type from a plurality of preset sets of rules, in which the set of target rules includes a plurality of target rules, and the target rules are rules followed by the target type of text; and according to a sequence of the target rules, inputting the plurality of target rules into a large language model sequentially to obtain a target text of the target type generated by the large language model. In this way, the accuracy of generating text following certain rules by the large language model is improved.
A method for generating a dialogue includes acquiring a current first question statement and historical dialogue information associated with the first question statement; acquiring, from a knowledge base, a first knowledge item associated with the first question statement and a second knowledge item having a question-answer relationship with the first knowledge item; obtaining a first reply statement output by a generative model by inputting the first question statement, the first knowledge item, and the historical dialogue information into the generative model; evaluating the first reply statement based on the first question statement, the first knowledge item, and the second knowledge item; and outputting the first reply statement in response to the first reply statement passing evaluation.
A training method and apparatus for a full atomic structure prediction model. The method includes: obtaining structural information of a biomolecule and a first dynamic trajectory of the biomolecule, in which the first dynamic trajectory includes position information of atoms in the biomolecule at different time points; adding noise to the first dynamic trajectory to obtain a second dynamic trajectory; encoding the structural information to obtain encoded features; decoding the encoded features and the second dynamic trajectory to obtain a target dynamic trajectory; and training an initial full atomic structure prediction model based on a difference between the target dynamic trajectory and the first dynamic trajectory, to obtain the full atomic structure prediction model.
The present disclosure relates to the technical field of artificial intelligence, and in particular relates to the technical fields of deep learning, natural language processing, etc. Provided are a model compression method and apparatus, a training method and apparatus, and a text data processing method and apparatus. A specific implementation of the model compression method involves: according to the number of concurrently deployed calculation units, dividing initial model parameters of a model to be compressed, so as to obtain initial local model parameters corresponding to the plurality of calculation units, wherein the calculation units are used for processing the same task associated with the model to be compressed; according to an initial input activation value, rematching the correspondence between initial weight parameters and the calculation units, so as to obtain target local model parameters corresponding to the plurality of calculation units, wherein a target input activation value among the target local model parameters gradually increases in a data processing direction; and performing quantization on the target local model parameters, so as to obtain a compressed model.
A method is provided that includes: obtaining a reference image and a description text; extracting a text feature of the description text; and performing the following operations based on a pre-trained diffusion model to generate a target image: in each time step of the diffusion model: calculating a first cross-attention feature of a first image feature and the text feature; obtaining a second cross-attention feature of a second image feature of the reference image and the text feature; editing the first cross-attention feature based on the second cross-attention feature to obtain a third cross-attention feature; and generating a result image feature of the time step based on the third cross-attention feature and the text feature; and decoding a result image feature of a last time step to generate the target image.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
56.
MODEL TRAINING METHOD, MODEL REASONING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Provided is a model training method, a model reasoning method, an electronic device, and a storage medium, relating to the field of data processing, and especially to the technical fields of artificial intelligence, big data, deep learning and large models. The model training method includes: folding an initial token sequence for training a model based on a folding feature value for folding a token sequence to obtain at least a first token sequence subjected to the folding, wherein the initial token sequence represents a token sequence composed of T1 tokens, and the first token sequence has a sequence length less than that of the initial token sequence; and inputting at least the first token sequence into a preset model to train the preset model so as to obtain a target model.
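The folding of an initial T1-token sequence into a shorter first token sequence can be pictured with the toy sketch below; grouping f consecutive tokens into tuples is an assumption about what "folding by a folding feature value" does at the data level.

```python
# Toy illustration: T1 tokens grouped f at a time, so the folded sequence is
# roughly T1 / f long. How tokens inside a group are merged is an assumption
# here (simple tuples).
def fold_token_sequence(tokens, fold_value):
    """tokens: list of T1 tokens; fold_value: folding feature value f >= 1.
    Returns a shorter sequence whose elements each cover f consecutive tokens."""
    if fold_value < 1:
        raise ValueError("folding feature value must be >= 1")
    return [tuple(tokens[i:i + fold_value])
            for i in range(0, len(tokens), fold_value)]

initial = list(range(12))          # T1 = 12 tokens
first = fold_token_sequence(initial, 3)
print(len(first), first[0])        # 4 (0, 1, 2) -> shorter than the initial T1
```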
Provided is a large language model training method, an electronic device and a storage medium, relating to the field of artificial intelligence technologies, and in particular, to the fields of deep learning, natural language processing and large model. The method includes: performing dimension reduction parameter fusion on a two-dimensional parameter matrix on each channel in each network layer in a first large language model, respectively, to obtain a second large language model; performing layer reduction parameter fusion on network layers in the second large language model based on a three-dimensional parameter matrix of each network layer in the second large language model to obtain a third large language model; and training the third large language model to obtain a target large language model under the condition that the target loss function determined based on the first and third large language models meets a preset first function condition.
A data processing method and apparatus, an electronic device, and a storage medium are disclosed, which relate to fields of artificial intelligence such as distributed storage and cloud computing. The method includes: determining a priority of each placement group in a cache pool respectively, and dividing placement groups with the same priority into a same waiting queue; constructing a target queue which is initially empty, and in response to determining that a supplementary trigger condition is met, determining placement groups to be retrieved based on the principle that a placement group in a waiting queue with a higher priority is retrieved first, retrieving the placement groups to be retrieved from the corresponding waiting queues and adding them to the target queue; and in response to determining that the target queue is not empty, iteratively traversing each placement group in the target queue, wherein, when a placement group is traversed, it is used as a target placement group, the number of writable objects is determined as a first quantity, and the first quantity of objects retrieved from the target placement group is written to a backend pool.
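The waiting-queue and target-queue mechanics above are condensed in the following sketch; the priority ordering (a larger value meaning higher priority), the batch size, and the write-path callables are assumptions made for illustration.

```python
# Condensed sketch of the queue mechanics; priorities, the supplementary trigger,
# and the backend write path are simplified assumptions.
from collections import defaultdict, deque

def build_waiting_queues(placement_groups, priority_of):
    """Group cache-pool placement groups into one waiting queue per priority."""
    queues = defaultdict(deque)
    for pg in placement_groups:
        queues[priority_of(pg)].append(pg)
    return queues

def refill_target_queue(waiting_queues, target_queue, batch_size):
    """Move up to batch_size placement groups into the (initially empty) target
    queue, always draining higher-priority waiting queues first
    (larger priority value = higher priority in this sketch)."""
    for priority in sorted(waiting_queues, reverse=True):
        queue = waiting_queues[priority]
        while queue and len(target_queue) < batch_size:
            target_queue.append(queue.popleft())
        if len(target_queue) >= batch_size:
            break
    return target_queue

def flush_to_backend(target_queue, writable_count_of, write_objects):
    """Traverse the target queue; for each placement group, write as many objects
    as the backend pool can currently accept (the 'first quantity')."""
    for pg in list(target_queue):
        first_quantity = writable_count_of(pg)
        write_objects(pg, first_quantity)
```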
A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.
A method for evaluating a large model, an electronic device and a computer readable storage medium are provided, which relate to a field of artificial intelligence technology, and in particular to fields of large model technology and deep learning technology. The method includes: evaluating a response information of each of M large language models for an input instruction based on a preset evaluation rule, so as to obtain a first evaluation information for each response information, where M is a positive integer greater than 1; evaluating, in response to the first evaluation information for the M large language models being consistent with each other, each response information in a plurality of evaluation dimensions, so as to obtain a second evaluation information for each response information; and determining an evaluation result representing a responsiveness of each large language model, according to the second evaluation information for each response information.
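A rough, hypothetical sketch of the two-stage flow follows; rule_score and dimension_score are placeholder callables rather than part of any real evaluation framework.

def evaluate_responses(responses, rule_score, dimension_score, dimensions):
    """Rule-based pass first; run the multi-dimension pass only when the
    rule-based results of all M models are identical."""
    first = [rule_score(r) for r in responses]          # first evaluation information
    if len(set(first)) > 1:
        return first                                    # the rule already separates the models
    # Consistent first-stage results: score every response on each dimension.
    second = [sum(dimension_score(r, d) for d in dimensions) for r in responses]
    return second                                       # basis for the final evaluation result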
A method and an apparatus for optimizing an mRNA sequence, an mRNA molecule, a pharmaceutical composition, and a use thereof are provided. The disclosure relates to the technical field of artificial intelligence, specifically to technical fields such as biological computing. The method for optimizing the mRNA sequence includes: obtaining a first mRNA sequence for synthesizing a protein of interest, where the first mRNA sequence includes a 5′ untranslated region sequence and a coding region sequence; and adjusting the 5′ untranslated region sequence and the coding region sequence with the goal of maximizing a first score of the first mRNA sequence, so as to obtain an optimized second mRNA sequence for synthesizing the protein of interest, where the first score reflects at least one of the following indicators of the first mRNA sequence: translation initiation efficiency, codon adaptation index, and minimum free energy.
A vehicle operating system (VOS) in an autonomous driving vehicle (ADV) can communicate with a cloud platform to automatically train AI models. The VOS collects real-time data from the ADV, generates first inference data based on the real-time data using a teacher edge model of an AI model, and generates second inference data based on the real-time data using a student edge model of the AI model. The VOS then obtains one or more differences between the first inference data and the second inference data, and retrains the student edge model of the AI model based on the one or more differences. Both the real-time data and the retrained student edge model are uploaded to a cloud platform for use in upgrading the student edge model and the teacher edge model on the cloud platform. The upgraded teacher edge model and student edge model can be redeployed over-the-air (OTA) through a software-defined process. The above process of training AI models can be repeated automatically in a closed loop without user intervention.
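As a rough illustration (not from the disclosure), one iteration of the closed loop might look like the following; the teacher, student, retrain and upload callables are placeholders, not a real VOS or cloud API.

def closed_loop_step(teacher, student, retrain, upload, real_time_data):
    """One iteration of the teacher/student loop; all callables are placeholders."""
    first_inference = teacher(real_time_data)        # teacher edge model output
    second_inference = student(real_time_data)       # student edge model output
    differences = [t - s for t, s in zip(first_inference, second_inference)]
    retrained_student = retrain(student, real_time_data, differences)
    upload(real_time_data, retrained_student)        # cloud side upgrades both models
    return retrained_student                         # later redeployed OTA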
A method and apparatus for accessing a network function virtualization controller by a network element are provided. The method includes: creating at least one service unit in a region to which a network element belongs; associating the at least one service unit with a system VPC corresponding to the network element; associating at least one availability zone comprised in the service unit with at least one device pool and at least one subnet respectively, where the device pool is formed by aggregating at least one virtual network element device; associating the at least one device pool with the at least one subnet based on an IP correspondence; and accessing, by the at least one service unit, a network function virtualization controller deployed in the region to which the network element belongs.
H04L 41/342 - Signalling channels for network management communication between virtual entities, e.g. orchestrators, SDN or NFV entities
H04L 41/40 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
H04L 67/60 - Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
64.
METHOD AND APPARATUS FOR ACCESSING VIRTUAL PRIVATE CLOUD, DEVICE AND STORAGE MEDIUM
The present disclosure provides a method and apparatus for accessing a virtual private cloud (VPC), a device and a storage medium, which are applied to the fields of cloud computing, intelligent search, the Internet of Things and other technical fields in data processing. The method includes: receiving a first access request, where the first access request carries virtual address information, and the first access request is used for indicating access to a target service node arranged in a target VPC; the target service node has actual address information; the virtual address information is used for indicating actual address information of the target VPC and the target service node, and the actual address information is a real address of a service node; and accessing the target service node according to the first access request.
A method of generating a code based on a large model, an electronic device and a storage medium are provided, which relate to the field of artificial intelligence technology, in particular to the fields of deep learning technology and large model technology. The method includes: acquiring a first descriptive text input by a user, where the first descriptive text is configured to characterize a code requirement; searching for a positive code and a negative code matching the first descriptive text, where each of the positive code and the negative code is determined based on a preference operation of the user for a historical code output by the large model; generating a second descriptive text according to the first descriptive text, the positive code, and the negative code; and inputting the second descriptive text into the large model to output a target code matching the code requirement.
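As a rough illustration (not from the disclosure), the second descriptive text could be assembled by concatenating the requirement with the preference-derived examples; the prompt wording below is purely hypothetical.

def build_second_descriptive_text(first_text, positive_code, negative_code):
    """Combine the code requirement with preference-derived examples before
    sending it to the large model (template is illustrative only)."""
    return (
        f"Requirement:\n{first_text}\n\n"
        f"Code the user previously accepted (follow this style):\n{positive_code}\n\n"
        f"Code the user previously rejected (avoid these patterns):\n{negative_code}\n\n"
        "Generate code that satisfies the requirement."
    )

# The resulting text is then input into the large model to obtain the target code.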
A method for obtaining an antibody sequence includes: obtaining first features of amino acids at different sequence positions according to an antigen multiple sequence alignment (MSA) sequence, an antibody MSA sequence, and a concatenated sequence of the antigen MSA sequence and the antibody MSA sequence; obtaining second features of the amino acids at different 3D coordinates according to a graph constructed from a reference antigen-antibody complex; fusing the first features of amino acids at different sequence positions with the second features of amino acids at 3D coordinates corresponding to the different sequence positions, and obtaining probability information of each of the amino acids at different positions in the antibody sequence according to the fused features; and obtaining a target antibody sequence according to the amino acids and their probability information at different positions in the antibody sequence.
G16B 40/00 - ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
67.
TASK EXECUTION METHOD FOR LARGE MODEL, DEVICE, AND MEDIUM
A task execution method for a large model, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence technology, particularly to fields of deep learning technology and large model technology. The method includes: executing a modality routing task by using a target computing unit based on a target feature to be processed to obtain a modality recognition result; executing a field routing task by using the target computing unit based on the target feature to be processed and a target field gating model parameter to obtain a field recognition result; and executing a feedforward task by using the target computing unit based on the target feature to be processed and a target feedforward task model parameter to obtain a task execution result.
A task execution method for a large model relates to fields of artificial intelligence, deep learning and large model technologies, and includes: executing attention tasks in a task group to be fused using a target computing unit to obtain attention features, where each attention task corresponds to a weighted matrix to be fused, and the weighted matrix to be fused is obtained by weighting a matrix to be fused using a weight; obtaining a processing result according to the attention features; determining loss information according to the processing result; and weighting and fusing matrices to be fused using the target computing unit according to weights for the task group to be fused if the loss information converges, to obtain a fusion matrix for a target task group, where a target task in the target task group is executed by the target computing unit according to the fusion matrix.
An information processing method. The method includes obtaining a first bilingual sentence pair, in which the first bilingual sentence pair comprises a source language sentence and a target language sentence; and obtaining a distilled second bilingual sentence pair by distilling a first language sentence in the first bilingual sentence pair based on a large language model (LLM), in which the first language sentence is the source language sentence or the target language sentence.
An annotation method for a large language model, an electronic device, and a medium are provided. The method may include: obtaining a plurality of response texts that are generated by a large language model for a request text and that meet a difference requirement; obtaining a plurality of scores corresponding to the plurality of response texts, where each of the plurality of scores indicates a degree to which a corresponding response text in the plurality of response texts matches the request text; and obtaining an annotated text for at least one of the plurality of response texts based on the plurality of scores, where the annotated text is used to adjust a parameter of the large language model.
A query answering method, an electronic device, a storage medium, and an intelligent agent are provided, which relate to a field of artificial intelligence technology, and in particular to fields of large model, intelligent search and information processing technology. The method includes: inputting, in response to a retrieval content set retrieved based on a query, the query, the retrieval content set and prompt information for answer generation into the large model, so that the large model performs operations of: processing, based on a current task in the prompt information and the query, a current text corresponding to the retrieval content set to obtain a processed text, where the current task is determined based on a task execution order in the prompt information; and obtaining, in a case of determining that the processed text meets a preset condition, an answer to the query based on the processed text.
A large model-based method of generating a text, a method of training a text generation model, a device, and a medium are provided, which relate to a field of artificial intelligence technology, specifically to fields of deep learning, natural language processing and large model technologies. The large model-based method of generating a text includes: acquiring a memory state for a text to be processed, where the memory state is generated based on a previous text of the text to be processed; determining an embedding feature of the text to be processed as an initial hidden state, and processing the memory state and the initial hidden state by using a first attention mechanism to obtain an updated hidden state; and generating a subsequent text for the text to be processed based on the updated hidden state.
The present application provides a compound for inhibiting protein-protein interaction between CDK4/6 and cyclin D. The compound has the structure shown in formula (I), formula (II), formula (III), formula (IV), formula (V) or formula (VI), and has a use in the preparation of drugs for diseases that benefit from or can be treated by inhibiting the protein-protein interaction between CDK4/6 and cyclin D.
C07J 9/00 - Normal steroids containing carbon, hydrogen, halogen, or oxygen, substituted in position 17 beta by a chain of more than two carbon atoms, e.g. cholane, cholestane, coprostane
C07D 403/02 - Heterocyclic compounds containing two or more hetero rings, having nitrogen atoms as the only ring hetero atoms, not provided for by group containing two hetero rings
C07D 413/00 - Heterocyclic compounds containing two or more hetero rings, at least one ring having nitrogen and oxygen atoms as the only ring hetero atoms
In one embodiment, a method for controlling PCIe devices on an autonomous driving system (ADS) of an autonomous driving vehicle (ADV) is disclosed. The method includes scanning PCIe ports of the ADS to discover PCIe device(s) mounted on the ADS. Next, a comparison is performed between a list of the discovered PCIe device(s) and an expected list of PCIe devices for the ADS. Next, an offline PCIe device is determined based on the comparison, the offline PCIe device corresponding to a PCIe device that is present in the expected list of PCIe devices but not in the list of the discovered PCIe devices. Then a device reset command is transmitted to a programmable logic device that manages power for PCIe devices of the ADS to reset the offline PCIe device, wherein the programmable logic device generates a reset control signal for the offline PCIe device to reset the PCIe device.
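As a rough illustration (not from the disclosure), the comparison step can be sketched as a set difference; the device names and the send_reset_command hook are hypothetical and do not correspond to a real ADS interface.

def find_offline_devices(discovered, expected):
    """PCIe devices expected on the ADS but absent from the port scan."""
    return sorted(set(expected) - set(discovered))

def reset_offline_devices(discovered, expected, send_reset_command):
    """Forward a reset command for every offline device to the power-managing
    programmable logic device (send_reset_command is a hypothetical hook)."""
    for device in find_offline_devices(discovered, expected):
        send_reset_command(device)

# Example with illustrative device names
expected = ["lidar_bridge", "camera_switch", "gps_card"]
discovered = ["lidar_bridge", "gps_card"]
print(find_offline_devices(discovered, expected))  # ['camera_switch']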
A cloud network system, a cloud network message processing method, and an electronic device are provided, which relate to the field of artificial intelligence technology, specifically to the fields of cloud networks and network security, and may be applied to intelligent cloud scenarios. The cloud network message processing method includes: obtaining a cloud network message, where the cloud network message is sent from a source end to a cloud security device; determining a target security device for the cloud network message from pre-configured multiple types of candidate security devices, where the candidate security devices include a built-in security device inside the cloud security device and a third-party security device outside the cloud security device; sending the cloud network message to the target security device for security processing, and sending the security-processed cloud network message from the target security device to a destination end.
The present disclosure provides a method of voice chip implementation, a voice chip, an intelligent voice product, an electronic device, and a storage medium, and relates to the field of artificial intelligence (AI) such as intelligent voice and AI chips. The method may include: constructing a voice chip including a first Digital Signal Processor (DSP) and a second DSP, the first DSP and the second DSP corresponding to a same Digital Signal Processor core Identifier (DSP core IP) and adopting heterogeneous designs; and completing a chip processing function in a corresponding intelligent voice product by using the voice chip, wherein different functions are completed by using the first DSP and the second DSP respectively. By use of the solutions of the present disclosure, implementation costs can be reduced.
A printed circuit board (PCB) includes one or more signal layers in between a first surface and a second surface, the one or more signal layers including a first signal layer. The PCB includes a connector interface disposed on the first surface and a first signal via electrically coupled to the connector interface. The PCB includes a first transmission line disposed at the first signal layer electrically coupling a first location to a second location of the first signal layer, and a second signal via disposed at the second location. The PCB includes a power-over-cable circuit disposed on the first surface and electrically coupled to the second signal via, and a receiver circuit disposed on the first surface adjacent to the power-over-cable circuit. The PCB includes a second transmission line disposed on the first surface electrically coupling the power-over-cable circuit to the receiver circuit.
The present disclosure provides a data marking method, apparatus, system, device, and storage medium, and relates to the technical field of data processing, and in particular to fields such as artificial intelligence, big data, and deep learning. The specific implementation solution is as follows: acquiring multiple pictures whose contents are continuous, wherein the multiple pictures contain at least one same object; for each object, determining a position offset of the object by using position information of the object in two adjacent pictures, wherein the two adjacent pictures include a first previous picture and a second previous picture; the second previous picture is a picture before and adjacent to a picture to be marked in time sequence, and the first previous picture is a picture before and adjacent to the second previous picture in time sequence; determining estimated position information of the object in the picture to be marked based on the position information of the object in the second previous picture and the position offset; and marking the object in the picture to be marked based on the estimated position information. The present disclosure can speed up the marking of the same object in multiple pictures.
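As a rough illustration (not from the disclosure), the estimation step amounts to adding the offset observed between the two previous pictures to the object's position in the second previous picture; positions are simplified to (x, y) centers here.

def estimate_position(pos_first_prev, pos_second_prev):
    """pos_first_prev, pos_second_prev: (x, y) positions of the same object in
    the first previous and second previous pictures."""
    offset = (pos_second_prev[0] - pos_first_prev[0],
              pos_second_prev[1] - pos_first_prev[1])
    # Pre-mark the object in the picture to be marked by extrapolating the motion.
    return (pos_second_prev[0] + offset[0], pos_second_prev[1] + offset[1])

# Example: the object moved 5 px right and 2 px down between the two previous
# pictures, so it is pre-marked another 5 px right and 2 px down.
print(estimate_position((100, 50), (105, 52)))  # (110, 54)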
A method for invoking a plugin of a large language model includes: acquiring natural language content; performing semantic understanding on the natural language content and detecting whether the natural language content hits a plugin to obtain a first plugin pointed to by the plugin hit result; comparing the first plugin with a second plugin corresponding to the current session understanding task to determine a to-be-executed session understanding task and a third plugin corresponding to the to-be-executed session understanding task; acquiring the language understanding content of the to-be-executed session understanding task and sending the language understanding content to the large language model to obtain the input parameter of the third plugin; and calling the third plugin according to the input parameter of the third plugin to obtain the calling result of the to-be-executed session understanding task.
A speech recognition method and a method for training a deep learning model are provided. The speech recognition method includes: obtaining a first speech feature of a speech to be recognized, which includes a plurality of speech segment features corresponding to a plurality of speech segments; decoding the first speech feature using a first decoder to obtain a plurality of first decoding results corresponding to a plurality of words, indicating a first recognition result of the words; extracting a second speech feature from the first speech feature based on first a priori information including the plurality of first decoding results, where the second speech feature includes first word-level audio features corresponding to the plurality of words; and decoding the second speech feature using a second decoder to obtain a plurality of second decoding results corresponding to the plurality of words, indicating a second recognition result of the words.
A method is provided that includes: obtaining an editing instruction input by a user in a current round of a dialogue and history dialogue information in at least one history round of the dialogue, wherein the history dialogue information comprises a history dialogue text and at least one history image; determining a source image to be edited from the at least one history image based on the editing instruction and the history dialogue information; and editing the source image to generate a target image based on the editing instruction.
A method for sorting search results, includes: acquiring a plurality of first search results corresponding to a search request and an initial order of the plurality of first search results; determining a type and a style of each of at least one first search result in response to the at least one first search result being push information; based on a preset mapping relationship, determining a first score corresponding to each of the at least one first search result according to the type and the style of each of the at least one first search result; and acquiring an updated order of the plurality of first search results by adjusting the initial order in order of the first score from largest to smallest.
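As a rough illustration (not from the disclosure), the score lookup and reordering can be sketched as below; the (type, style) mapping values and the result dictionaries are hypothetical.

TYPE_STYLE_SCORES = {          # hypothetical preset (type, style) -> first score mapping
    ("ad", "banner"): 0.2,
    ("ad", "card"): 0.4,
    ("news", "card"): 0.8,
}

def reorder(first_results):
    """first_results: dicts in their initial order; push results carry 'type' and 'style'."""
    def first_score(result):
        if result.get("is_push"):
            return TYPE_STYLE_SCORES.get((result["type"], result["style"]), 0.0)
        return 0.0
    # Python's sort is stable, so results with equal scores keep the initial order.
    return sorted(first_results, key=first_score, reverse=True)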
A method of identifying a webpage, a device, and a medium are provided, which relate to a field of an artificial intelligence technology, in particular to fields of deep learning, knowledge graph and other technologies. The method of identifying the webpage includes: acquiring structural data of a target webpage, a first association relationship between the target webpage and a historical webpage, and historical graph data for the historical webpage; determining target graph data for the target webpage and the historical webpage based on the structural data of the target webpage, the first association relationship and the historical graph data; determining a similarity between the target webpage and the historical webpage based on the target graph data; and determining a category of the target webpage based on a category of the historical webpage and the similarity.
Disclosed are a method for generating point cloud data, an electronic device and a storage medium. The method includes: acquiring a set of real point clouds for a target object based on a LiDAR; performing image acquisition on the target object, and generating a set of pseudo point clouds based on an acquired image; and generating the set of target point clouds for model training by fusing the set of real point clouds and the set of pseudo point clouds.
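As a rough illustration (not from the disclosure), the fusion step can be as simple as concatenating the two point sets, optionally tagging each point with its source; the array layout below is an assumption.

import numpy as np

def fuse_point_clouds(real_points, pseudo_points):
    """real_points, pseudo_points: arrays of shape [N, 3] and [M, 3] (x, y, z).
    A trailing flag column marks LiDAR points (1) versus image-derived pseudo points (0)."""
    real_tagged = np.concatenate([real_points, np.ones((len(real_points), 1))], axis=1)
    pseudo_tagged = np.concatenate([pseudo_points, np.zeros((len(pseudo_points), 1))], axis=1)
    return np.concatenate([real_tagged, pseudo_tagged], axis=0)  # [N + M, 4] training set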
A training method, an inference method, a device, an apparatus, and a medium for a deep learning model are provided. A first model includes a plurality of first parameters, a second model comprises a plurality of second parameters, which is initialized to parameter values of a plurality of target parameters selected from the plurality of first parameters. The training method includes: determining a target loss for both the first model and the second model; adjusting parameter values, including: in response to determining that the target loss indicates that the parameter values of at least part of the target parameters need to be adjusted, synchronously adjusting the parameter values of the corresponding second parameters; and in response to determining that the target loss indicates that the parameter values of at least part of the second parameters need to be adjusted, synchronously adjusting the parameter values of the corresponding target parameters.
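As a rough illustration (not from the disclosure), the synchronized adjustment can be sketched with shared parameter indices; the flat parameter vector, the gradient step, and the learning rate below are assumptions.

import numpy as np

def synchronized_update(first_params, target_indices, grads_on_targets, lr=1e-3):
    """first_params: 1-D array of the first model's parameters.
    target_indices: which first-model parameters the second model shares.
    grads_on_targets: gradient of the joint target loss w.r.t. those parameters."""
    first_params = first_params.copy()
    first_params[target_indices] -= lr * grads_on_targets   # adjust the target parameters
    second_params = first_params[target_indices]            # second model mirrors them exactly
    return first_params, second_params

# Example: two shared parameters out of five
first = np.zeros(5)
first_new, second_new = synchronized_update(first, np.array([1, 3]), np.array([0.5, -0.5]))
print(first_new, second_new)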
G06N 3/043 - Architecture, e.g. interconnection topology based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
A data generation method is provided. The data generation method includes: generating first answer data based on first question data from a user; determining, in response to receiving negative feedback from the user for the first answer data, a first reflection result for the first answer data based on the first answer data and the negative feedback, wherein the first reflection result indicates a diagnosis reason why feedback from the user for the first answer data is negative; and generating second answer data for the first question data based on the first question data and the first reflection result.
A method of determining interaction information, an electronic device and a storage medium are provided, which relate to a field of artificial intelligence technology, in particular to large models, generative models, NLP, intelligent search and other fields. An implementation is to: determine a plurality of questioning dimensions according to query information of a subject and historical query information, where each questioning dimension includes a dimension name and a plurality of options; determine a target questioning dimension from the plurality of questioning dimensions according to evaluation values of the plurality of questioning dimensions and whether semantic information of the plurality of questioning dimensions is consistent with semantic information of a query result associated with the query information; and determine the interaction information according to the dimension name and the plurality of options in the target questioning dimension.
A method is provided that includes: obtaining a list of browsed information of each of a plurality of first users and a first vector corresponding to each list of browsed information; clustering the first vectors to obtain one or more vector clusters and a center vector of each vector cluster; determining one or more information clusters respectively corresponding to the one or more vector clusters; obtaining a list of browsed information of a second user in response to a browsing request of the second user; determining, in response to determining that the list of browsed information of the second user is not empty, a second vector corresponding to the list of browsed information of the second user; calculating a similarity between the second vector and each center vector, to determine an information cluster matched with the second vector; and providing recommendations for the second user based on the information cluster.
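As a rough illustration (not from the disclosure), the clustering and matching steps can be sketched with scikit-learn's KMeans as a stand-in clustering algorithm and cosine similarity for matching; the cluster count and vector shapes are assumptions.

import numpy as np
from sklearn.cluster import KMeans

def build_clusters(first_vectors, n_clusters=3):
    """first_vectors: [num_first_users, dim] array of browse-list vectors."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(first_vectors)
    return model.cluster_centers_, model.labels_   # labels_ groups lists into information clusters

def match_cluster(second_vector, centers):
    """Return the index of the center most similar (cosine) to the second user's vector."""
    sims = centers @ second_vector / (
        np.linalg.norm(centers, axis=1) * np.linalg.norm(second_vector) + 1e-12)
    return int(np.argmax(sims))   # recommendations come from the matched information cluster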
A method for training a speech translation model includes: obtaining a trained first text translation model and a speech recognition model, and constructing a candidate speech translation model to be trained based on the first text translation model and the speech recognition model; obtaining at least one of a first sample source language speech or a first sample source language text to obtain a training sample of the candidate speech translation model; and training the candidate speech translation model based on the training sample until the training is completed, and obtaining a trained target speech translation model.
The present disclosure provides a method and apparatus for training a reference frame screening model, a method and apparatus for screening a reference frame, an electronic device, a storage medium, and a computer program product, and relates to the field of artificial intelligence. An implementation scheme is: acquiring a training sample set, where training samples in the training sample set include a sequence of video frames and labels corresponding to video frames in the sequence of video frames, and the labels are used to represent whether the video frames corresponding to the labels are reference frames of other video frames in the sequence of video frames; and training, using a machine learning method, with the sequence of video frames as input and the labels corresponding to the input sequence of video frames as desired output, to obtain a reference frame screening model.
G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
G06V 20/70 - Labelling scene content, e.g. deriving syntactic or semantic representations
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
91.
PROTEIN DOCKING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A method for protein docking includes: docking a first protein and a second protein according to at least two molecular docking methods to generate a complex conformation in each round of iteration; and, when it is recognized that an iteration end condition is not satisfied, continuing with a next round of iteration until the iteration end condition is satisfied, and obtaining a final complex conformation.
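As a rough illustration (not from the disclosure), the iterative scheme can be sketched as below; the docking callables, the scoring function and the end condition (a score threshold or round limit) are hypothetical.

def iterative_docking(docking_methods, score, protein_a, protein_b,
                      score_threshold, max_rounds=10):
    """docking_methods: callables each producing a candidate complex conformation;
    score: callable rating a conformation (higher is better). Both hypothetical."""
    best = None
    for _ in range(max_rounds):
        candidates = [dock(protein_a, protein_b, seed=best) for dock in docking_methods]
        best = max(candidates, key=score)
        if score(best) >= score_threshold:   # iteration end condition satisfied
            break                            # otherwise continue with the next round
    return best                              # final complex conformation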
Provided is a method for training a multi-task fusion detection model, which includes: obtaining a single-task detection model of each detection task in a detection task set, and obtaining an initial multi-task fusion detection model to be trained based on each single-task detection model; obtaining a training sample set of the initial multi-task fusion detection model by obtaining a single-task sampling data set of each detection task, in which the training sample set includes a single-task sample and a multi-task sample; and training the initial multi-task fusion detection model according to the single-task sample and/or the multi-task sample until the training is completed, to obtain a trained target multi-task fusion detection model.
A copywriting generation method, an electronic device and a storage medium are provided and relate to a field of artificial intelligence technology, in particular to fields of deep learning and natural language processing technologies, and may be applied to scenarios of large language models and generative dialogues. The copywriting generation method includes: updating, in response to an input copywriting requirement information being received, a copywriting prompt information in the copywriting requirement information according to a copywriting generation operation related to the copywriting requirement information, so as to obtain a first target copywriting requirement information, where the first target copywriting requirement information includes a target copywriting prompt information related to a semantic attribute of the copywriting requirement information; and processing the first target copywriting requirement information based on a pre-trained deep learning model, so as to generate a first feedback copywriting corresponding to the copywriting requirement information.
The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, and relates to the field of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, the communication group including a tensor-parallelism communication group, the tensor-parallelism communication group including at least two computing devices, tensor-parallelism segmentation being adopted for sparse parameters of each of the computing devices in a same tensor-parallelism communication group; and training an MoE model based on the communication group. By use of the solutions of the present disclosure, normal operation of model training can be guaranteed.
A cluster-based training method includes: in response to a hardware fault in a training node executing a model training task, selecting a target standby node from a plurality of standby nodes, and obtaining a target training snapshot of the model training task in the training node, in which the target training snapshot includes training state data of the model training task; and initializing the target standby node based on a container image of a model training program in the training node and the training state data, to replace the training node with the target standby node to continue executing the model training task.
A data processing method, and a data processing model and a training method therefor are provided, and relate to the field of artificial intelligence, and specifically, to natural language processing, deep learning technologies, and large model technologies. An implementation solution includes: determining input data, where the input data includes a plurality of tokens; determining a correlation between each of the plurality of tokens and each of a plurality of expert networks based on a gating matrix, where the plurality of expert networks are used to reinforce the plurality of tokens; allocating the plurality of tokens to the plurality of expert networks in a uniform manner based on the correlation and a preset capacity of each expert network, to reinforce the plurality of tokens; and determining a data processing result based on the plurality of reinforced tokens.
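As a rough illustration (not from the disclosure), a capacity-aware allocation that keeps expert load uniform can be sketched as below; the greedy best-available-expert rule is one possible realization, not necessarily the one claimed.

import numpy as np

def allocate_tokens(correlation, capacity):
    """correlation: [num_tokens, num_experts] gating scores.
    capacity: maximum number of tokens each expert may receive.
    Returns an array mapping token index -> expert index (or -1 if no expert has room)."""
    num_tokens, num_experts = correlation.shape
    load = np.zeros(num_experts, dtype=int)
    assignment = np.full(num_tokens, -1, dtype=int)
    for t in range(num_tokens):
        for expert in np.argsort(-correlation[t]):     # most correlated expert first
            if load[expert] < capacity:
                assignment[t] = expert
                load[expert] += 1
                break
    return assignment

corr = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]])
print(allocate_tokens(corr, capacity=2))   # [0 0 1 1]: load is spread uniformly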
A content delivery network-oriented method of displaying an information is provided, which may be applied to a field of Internet technology, and in particular to a field of content delivery network technology. The method includes: acquiring a business configuration information in response to a detection that a configuration control on a display page is triggered, where the business configuration information includes a business domain name and a configuration parameter; acquiring a task information in response to an operation for an abnormal resource information being detected, where the abnormal resource information is determined according to the business domain name and the configuration parameter; and displaying a processing result for the abnormal resource information according to the task information.
H04L 41/0853 - Retrieval of network configuration; tracking network configuration history by actively collecting configuration information or by backing up configuration information
H04L 41/22 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks comprising specially adapted graphical user interfaces [GUI]
98.
IMAGE ENCODING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A method of encoding an image, an electronic device and a storage medium are provided, which relate to a field of image processing technology, in particular to fields of image encoding, video encoding and video compression technologies. The method includes: determining a target region containing an object from an original image, where the original image includes a plurality of image blocks; filtering, for each image block, the image block according to a positional relationship between the image block and the target region to obtain a filtered image block; determining a filtered image according to the filtered image block; and encoding the filtered image according to the target region.
H04N 19/80 - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
H04N 19/119 - Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
H04N 19/147 - Data rate or code amount at the encoder output according to rate distortion criteria
H04N 19/157 - Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
H04N 19/167 - Position within a video image, e.g. region of interest [ROI]
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/20 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
99.
METHOD, DEVICE, EQUIPMENT AND STORAGE MEDIUM FOR REDIRECTING EDGE NODES FOR CLIENTS
A method for redirecting an edge node for a client includes: determining a peak bandwidth of a target region based on peak bandwidths of edge nodes in the target region, in which there are at least two edge nodes in the target region; determining a bandwidth ratio of each edge node based on the peak bandwidth of the target region and the peak bandwidth of each edge node; obtaining a hash value of a data request address of data requested by the client by calculating the data request address using a preset algorithm; and determining one target edge node, from the edge nodes of the target region, as the edge node for the data requested by the client based on the hash value and the bandwidth ratio of each edge node.
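As a rough illustration (not from the disclosure), mapping the hash value onto cumulative bandwidth ratios gives each edge node traffic roughly proportional to its share of the region's peak bandwidth; the MD5 hash and the 10,000-bucket granularity below are arbitrary choices.

import hashlib

def pick_edge_node(request_address, node_peaks):
    """node_peaks: dict of edge-node name -> peak bandwidth within the target region."""
    region_peak = sum(node_peaks.values())
    digest = hashlib.md5(request_address.encode()).hexdigest()
    point = (int(digest, 16) % 10_000) / 10_000          # hash mapped into [0, 1)
    cumulative = 0.0
    for node, peak in sorted(node_peaks.items()):
        cumulative += peak / region_peak                 # this node's bandwidth ratio
        if point < cumulative:
            return node
    return max(node_peaks, key=node_peaks.get)           # numerical fallback

print(pick_edge_node("/video/abc.mp4", {"node-a": 100, "node-b": 300}))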
A human-machine interaction solution is proposed, which relates to the field of artificial intelligence technologies, such as natural language processing technologies, large language models, deep learning technologies, or the like. The solution may include: acquiring a question input by a user during a conversation with a large language model; retrieving memory information in a memory bank, the memory information being historical memory information about the user; and in response to retrieving memory information required for generating answer information corresponding to the question, taking the retrieved memory information as matched memory information, and generating the answer information by the large language model in conjunction with the matched memory information.