NUANCE COMMUNICATIONS, INC. (Burlington, MA)
A speech dialog system comprises a first microphone, a second microphone, a processor and a memory. The first microphone captures first audio from a first spatial zone and produces a first audio signal. The second microphone captures second audio from a second spatial zone and produces a second audio signal. The processor receives the first audio signal and the second audio signal, and the memory contains instructions that control the processor to operate a speech enhancement module, an automatic speech recognition module, and a speech dialog module that performs a zone-dedicated speech dialog.
1. Field of the Disclosure
The present disclosure pertains to speech processing and, more particularly, to speech processing in an environment with several spatial zones, where the zone from which speech originates matters when evaluating the speech.
2. Description of Related Art
The approaches described in this section could be pursued, but have not necessarily been previously conceived or pursued. They are therefore not necessarily prior art to the claims in this application.
Multi-microphone speech applications often require interaction between several components, such as a speech enhancement (SE) module, an automatic speech recognition (ASR) module, and a speech dialog (SD) module. These components are often part of a framework that handles the interactions between them. In current systems, generally: (a) The SE performs multi-channel speech enhancement to deliver an output signal of improved quality. Multi-channel speech enhancement may include acoustic echo cancellation, noise reduction, and spatial filtering such as beamforming, speech signal separation, or cross-talk cancellation. The SE typically provides a single output signal for the ASR, but it may be a multiple-input multiple-output system with more than one output. The output signal is usually transmitted in blocks of 16 milliseconds each. (b) The ASR detects and recognizes speech utterances, e.g., a wake-up word (WuW) or a sequence of spoken words, based on the input signal, thus yielding a recognition result. All references to “WuW” in this document are intended to cover wake-up words as well as other speech utterances. (c) The SD may further evaluate the recognition result of the ASR and perform additional actions.
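As a small worked example of the block size mentioned above, the number of samples in one 16-millisecond block depends on the sample rate. The sample rates used below (16 kHz and 8 kHz) are assumptions for illustration only; the text does not state a sample rate.

```python
def samples_per_block(sample_rate_hz: int, block_ms: int = 16) -> int:
    """Number of whole samples in one output block of block_ms milliseconds."""
    return sample_rate_hz * block_ms // 1000


# Assumed sample rates, not taken from the patent text.
wideband_block = samples_per_block(16_000)    # 16 kHz audio
narrowband_block = samples_per_block(8_000)   # 8 kHz audio
print(wideband_block, narrowband_block)
```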
The SE can process input signals from multiple microphones and focus on different spatial zones to capture the signals of users (e.g. speakers) by means of spatial filtering. The SE can produce a spatially focused output signal for selective listening, i.e. with suppression of speech signals coming from other spatial zones. The SE can also combine signals from multiple spatial zones to produce an output signal for broad listening.
A WuW, or another speech utterance, may trigger a zone-dedicated dialog with a user on systems with several microphones. While listening for the WuW, the system listens across all spatial zones, i.e., broad listening. Once a WuW is detected, the system may proceed with a speech dialog, in which case it is beneficial to admit only one spatial zone and to focus spatially on that zone for further interaction with the user there. During this phase, the system may ignore the other spatial zones, i.e., selective listening.
In some systems, multiple microphones cover multiple spatial zones, e.g., the seats in an automobile, for interaction with users located in those zones. A speech dialog may need to be conducted with only one of the users, for example to control the air conditioner or heater for that user’s seat. However, the information about which spatial zone is showing speech activity is available in the SE, but not in the ASR or the SD. The SD must know the relevant zone in order to conduct a zone-dedicated dialog.
Even if the system can identify the zone in which the user spoke the WuW, the SE may switch from broad listening to selective listening only after a delay. This delay may be caused by the ASR’s recognition process or by other latencies. If the user continues with the dialog immediately after speaking the WuW, the transition from broad listening to selective listening might happen in the middle of the user’s speech utterance.
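The difference between the two listening modes can be sketched as follows. This is a minimal illustration, not the SE described in the patent: it assumes the spatial filtering has already produced one signal per zone, and it uses a plain average as a stand-in for whatever combination a real SE applies in broad listening. The function name `se_output` and the sample values are hypothetical.

```python
from typing import Dict, List, Optional


def se_output(zone_signals: Dict[int, List[float]],
              focus_zone: Optional[int] = None) -> List[float]:
    """Return one output block: selective if focus_zone is given, else broad."""
    if focus_zone is not None:
        # Selective listening: pass only the focused zone, ignore the others.
        return list(zone_signals[focus_zone])
    # Broad listening: combine all zones (here a plain average, an assumption).
    n = len(zone_signals)
    return [sum(samples) / n for samples in zip(*zone_signals.values())]


# Hypothetical per-zone blocks, already spatially filtered.
zones = {1: [0.5, 0.25, 0.0], 2: [0.0, 0.25, 1.0]}
broad = se_output(zones)
selective = se_output(zones, focus_zone=2)
```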
U.S. Pat. No. 10,229,686, titled “Methods and apparatus for speech segmentation with multiple metadata”, describes transmitting metadata to an ASR engine for improved detection of the beginning of speech. However, it does not mention sending zone activity data to help the ASR engine, or any other system component outside the SE, determine the spatial zone in which a speaker is located.
In some prior art techniques, a selective-listening method is employed in which an SE provides a single output signal. A framework application requests information from the SE about the zone to which listening should be switched. The SE’s internal processing mode can then be switched from broad listening to selective listening after a WuW has been detected. This technique has a drawback: the framework application must control the SE’s internal configuration for the broad and selective listening modes, and this interaction between the framework and the SE adds complexity to the framework application. Another issue with this technique is that the request to the SE for information about the zone in which the WuW was spoken might be difficult to handle because of latencies between the components of the system, and if the components run on different computers, clock skew could be problematic.
Parallel WuW detectors can be used to increase robustness against interfering speech. In this technique, an SE always delivers multiple spatially focused outputs, each corresponding to a selective-listening mode for one of several spatial zones. During the WuW phase, multiple instances of an ASR run in parallel, each operating on a different output signal from the SE. The framework application selects one of the SE output signals to start a speech dialog after a WuW is recognized. One drawback of this approach is that it demands a very high central processing unit (CPU) load, because numerous ASR instances are active in parallel.
One technical problem addressed by this disclosure is that an ASR can recognize speech utterances but cannot detect the spatial zone in which an utterance was spoken, and therefore cannot distinguish between desired and interfering speech components. The technical solution provided by the present disclosure is that the SE transmits spatial zone activity information along with an audio signal to the ASR, which uses it for further processing and for distinguishing the activity of different spatial zones.
Another technical problem addressed by this disclosure is that seamless transitions between broad listening and selective listening are not possible because of latencies in detecting an utterance or recognizing a zone. The technical solution provided by the present disclosure is to send multiple audio streams, including selective and broad listening, from the SE to the ASR and to buffer them, so that the ASR can “look back in time” and resume recognition within the relevant zone.
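One way to picture zone activity information travelling with the audio is a block structure that carries both, from which the ASR side derives a zone decision for a recognized utterance. The sketch below is an illustration under stated assumptions, not the patented method: the `EnhancedBlock` type, the set-of-active-zones encoding, and the majority-vote rule in `zone_decision` are all hypothetical choices.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class EnhancedBlock:
    samples: List[float]      # processed audio from the SE
    active_zones: Set[int]    # zone activity metadata for this block


def zone_decision(blocks: List[EnhancedBlock], start: int, end: int) -> Optional[int]:
    """Pick the zone most often active in blocks[start:end] (assumed majority vote)."""
    counts = Counter(z for b in blocks[start:end] for z in b.active_zones)
    if not counts:
        return None  # no zone showed speech activity in this span
    return counts.most_common(1)[0][0]


# Hypothetical stream: zone 1 fades out, zone 2 speaks, then silence.
stream = [
    EnhancedBlock([0.0], {1}),
    EnhancedBlock([0.0], {1, 2}),
    EnhancedBlock([0.0], {2}),
    EnhancedBlock([0.0], {2}),
    EnhancedBlock([0.0], set()),
]
decided = zone_decision(stream, 1, 4)  # utterance recognized in blocks 1..3
```

Because the metadata rides along with each block, the ASR needs no separate query to the SE after the fact, which is the latency problem described for the prior art approach.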
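The “look back in time” idea can be sketched with bounded buffers, one per audio stream, indexed by block number. This is a minimal sketch assuming string stream identifiers, a fixed buffer depth, and blocks represented by placeholder strings; none of these details come from the patent text.

```python
from collections import deque
from typing import Dict, List


class StreamBuffer:
    """Keeps the most recent blocks of every stream so recognition can resume in the past."""

    def __init__(self, stream_ids: List[str], max_blocks: int) -> None:
        self._buffers = {sid: deque(maxlen=max_blocks) for sid in stream_ids}
        self._next_index = 0

    def push(self, blocks: Dict[str, str]) -> None:
        # One block per stream arrives for the same time step.
        for sid, block in blocks.items():
            self._buffers[sid].append((self._next_index, block))
        self._next_index += 1

    def look_back(self, stream_id: str, from_index: int) -> List[str]:
        # Return buffered blocks of one stream starting at from_index (if still held).
        return [block for idx, block in self._buffers[stream_id] if idx >= from_index]


buf = StreamBuffer(["broad", "zone1", "zone2"], max_blocks=4)
for i in range(6):
    buf.push({"broad": f"b{i}", "zone1": f"z1-{i}", "zone2": f"z2-{i}"})

# Suppose the WuW ended at block 2 and the zone decision was zone 2:
# resume recognition on the selective stream from block 3 onward.
resumed = buf.look_back("zone2", 3)
```

The buffer depth bounds how far back recognition can resume, so in practice it would need to cover at least the detection latency.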
The present disclosure provides a speech dialog system that includes a first microphone, a second microphone, a processor and a memory. The first microphone captures first audio from a first spatial zone and produces a first audio signal. The second microphone captures second audio from a second spatial zone and produces a second audio signal. The processor receives the first audio signal and the second audio signal, and the memory holds instructions that control the processor to perform operations of: (a) a speech enhancement (SE) module that detects, from the first audio signal and the second audio signal, speech activity in at least one of the first zone or the second zone, thus yielding processed audio, and determines from which of the first zone or the second zone the processed audio originated, thus yielding zone activity information; (b) an automatic speech recognition (ASR) module that recognizes an utterance in the processed audio, thus yielding a recognized utterance, and, based on the zone activity information, produces a zone decision that identifies from which of the first zone or the second zone the recognized utterance originated; and (c) a speech dialog (SD) module that performs a zone-dedicated speech dialog based on the recognized utterance and the zone decision.
Additionally, the SD module, based on the recognized utterance and the zone decision, decides from which of the first zone or the second zone to obtain additional audio, thus yielding a routing decision. The ASR module, based on the routing decision, obtains the additional audio from that zone and recognizes an additional utterance in the additional audio.
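The routing step can be illustrated with a small state holder on the SD side. This is a sketch under assumptions, not the claimed implementation: the `Recognition` and `SpeechDialog` names, the exact-match wake-word check, and the rule of keeping the dialog locked to the first zone that spoke the WuW are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recognition:
    utterance: str   # recognized utterance from the ASR
    zone: int        # zone decision from the ASR


class SpeechDialog:
    """Decides which zone the ASR should take additional audio from."""

    def __init__(self, wake_word: str) -> None:
        self.wake_word = wake_word
        self.active_zone: Optional[int] = None

    def route(self, rec: Recognition) -> Optional[int]:
        # Before a WuW, no zone is selected; the WuW locks the dialog to its zone.
        if self.active_zone is None and rec.utterance == self.wake_word:
            self.active_zone = rec.zone
        # The routing decision is the locked zone (None means keep listening broadly).
        return self.active_zone


sd = SpeechDialog(wake_word="hello car")
before = sd.route(Recognition("turn up the heat", zone=1))
after_wuw = sd.route(Recognition("hello car", zone=2))
later = sd.route(Recognition("turn up the heat", zone=1))
```

In this sketch an utterance from zone 1 after the wake-up word does not change the routing decision, which mirrors the selective-listening phase described earlier.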
What is a patent?
A patent is granted by the government to protect an invention. It gives the inventor the right to develop, utilize and market the invention. Society is benefited when new technologies are brought to the market. These benefits may be directly realized by people who are able to accomplish feats previously unattainable, or indirectly through the opportunities for economic growth that innovation offers (business expansion, job creation).
Patent protection is sought by many pharmaceutical companies and university researchers to protect their research and development. A patent can cover a physical or abstract product or process, or a method or composition of materials that is new to the field. To qualify for patent protection, an invention must be useful, novel, and not already known to others in the same field.
Patents give inventors a chance to be rewarded for commercially viable inventions. They act as an incentive for inventors to invent. Patents give small businesses and inventors confidence that they will earn a return on their investment in technology development, so that they can earn a living from their work.
Patents play a vital role in companies, and they can:
Secure your products and services;
Improve the value, appearance, and visibility of your products in the market;
Differentiate your business and products from others;
Give access to business and technical information;
Help you avoid accidentally using third-party content or losing valuable information, original outputs, or other creative work.
Patents convert knowledge of the inventor into a marketable asset, which creates new opportunities for employment creation through joint ventures and licensing.
Investors in the development and commercialization of technology will find small companies with patent protection appealing.
Patents can lead to innovative ideas and inventions. This information may itself be protected by patents.
Patents can deter untrustworthy third parties from profiting from the invention’s success.
Commercially successful patent-protected technology revenues can be used to finance technological research and development (R&D) that will boost the likelihood of improved technology in the near future.
Intellectual property ownership can be used to convince investors and lenders that there are real opportunities to commercialize your product. Sometimes a single patent can open up multiple financing options. Patents and other IP assets can be used as collateral or security for debt financing. Investors may also view the patents you own as increasing the value of your business. Forbes and other publications have stated that each patent could increase the value of a company by anything from $500,000 to $1 million.
A well-constructed business plan is crucial for startups. It must build on your IP and explain how your product or service is distinctive. Investors will also be impressed if your IP rights are secured, or in the process of being secured, and if they support your business plan.
It is crucial to keep an invention secret until a patent application has been filed. Public disclosure of an invention before filing can destroy its novelty and make it unpatentable. Disclosures made before filing, such as to investors, test-marketing partners or other business partners, should be made only after a confidentiality agreement has been signed.
There are many types of patents, and understanding them is crucial to safeguarding your invention. Utility patents cover new processes and machines. Design patents cover ornamental designs. Utility patents are the best option for protecting the owner from copycats and other competitors. Utility patents are also frequently issued for improvements or modifications to existing inventions. A process patent covers the acts or methods of performing a particular task. A chemical composition patent covers a combination of ingredients.
How long does a patent last? Utility patents last up to 20 years from their earliest filing date, although the term can be extended to compensate for delays at the Patent Office.
When you are writing a patent application, it is important to run a patent search, since it gives you insight into other people’s ideas. This helps you narrow the scope of your invention. You will also learn about the state of the art in your field of innovation, which will help you understand what your invention covers and prepare you to file your patent application.
How to Search for Patents
A patent search is the first step in obtaining your patent. You can run a Google Patents search or a USPTO search. “Patent pending” refers to a product covered by a patent application that has been filed. You can search Public PAIR to find the patent application. After the patent office approves your application, you can search by patent number to find the issued patent, and your product is then patented. Read on for more details. A patent attorney can assist you with the process. Patents in the United States are granted by the United States Patent and Trademark Office (USPTO), which also examines trademark applications.
Interested in finding more similar patents? Here are the steps:
1. Brainstorm terms that describe your invention, based on the purpose, composition and use.
Write down a concise, detailed description of the invention. Don’t use generic terms like “device”, “process”, or “system”. Look for synonyms of the terms you chose initially. Also, keep track of important technical terms and keywords.
To help you identify terms and keywords, you can use the following questions.
- What is the objective of the invention? Is it a utilitarian device or an ornamental design?
- Is the invention a way of making something or performing a function, or is it a product?
- What is the invention made of? What is its physical makeup?
- How is the invention used?
- What technical terms and keywords describe the nature of the invention? To help you find the correct terms, consult an online dictionary of technical terms.
2. Use these terms to find relevant Cooperative Patent Classifications (CPC) with the Classification Text Search Tool. If you cannot find a classification that fits your invention, scan through the class schemas (class schedules). If your Classification Text Search returns no results, try replacing the words you used to describe your invention with the synonyms you collected in step 1.
3. Review the CPC Classification Definition to confirm the relevance of the CPC classification that you have found. A link to the definition is available when the classification has a blue box containing “D” to its left. CPC classification definitions help you determine the scope of a classification so that you can select the most appropriate one. They may also provide search tips or other suggestions that are useful for further research.
4. Retrieve patent documents with the CPC classification from the Patents Full-Text and Image Database (PatFT). You can review and narrow down the relevant patent publications by focusing first on the abstract and the representative drawings.
5. Use this list of the most relevant patent publications to study each one in detail for similarities to your invention. Pay attention to the specification and the claims. Check the references cited by the applicant and the patent examiner for additional patents.
6. Retrieve published patent applications with the CPC classification you selected in step 3. You can use the same search strategy as in step 4 to narrow your results to the most relevant patent applications by reviewing the abstracts and representative drawings. Then examine the selected published patent applications carefully, paying special attention to the claims and the additional drawings.
7. Find additional US patents by keyword searching in the PatFT or AppFT databases, by classification searching of non-U.S. patents as described below, and by searching for non-patent literature disclosures of inventions using internet search engines. Here are some examples:
- Add keywords to your search. Keyword searches may turn up documents that are not well-categorized or have missed classifications during Step 2. For example, US patent examiners often supplement their classification searches with keyword searches. Think about the use of technical engineering terminology rather than everyday words.
- Search for foreign patents using the CPC classification. Then, re-run the search using international patent office search engines such as Espacenet, the European Patent Office’s worldwide patent publication database of over 130 million patent publications. Other national databases include:
  - European Patent Office (EPO) provides esp@cenet to access a network of Europe’s patent databases with access to machine translation of European patents.
  - Japan Patent Office (JPO) – with access to machine translations of Japanese patents.
  - World Intellectual Property Organization (WIPO) offers PATENTSCOPE with a full-text search of published international patent applications and machine translations for some documents, as well as a list of international patent databases.
  - Korean Intellectual Property Rights Information Service (KIPRIS)
  - State Intellectual Property Office (SIPO) with machine translation of Chinese patents.
  - Other International Intellectual Property Offices with online patent databases include Australia, Canada, Denmark, Finland, France, Germany, Great Britain, India, Israel, Netherlands, Norway, Sweden, Switzerland, and Taiwan.
- Search non-patent literature. Inventions can be made public in many non-patent publications. It is recommended that you search journals, books, websites, technical catalogs, conference proceedings, and other print and electronic publications.
You can hire a registered patent attorney to review your search. A preliminary search helps you prepare to discuss your invention and related inventions with a professional patent attorney. It also means the attorney will not need to spend much time or money on patenting basics.