<p><b>Appendix A: English Literature</b></p>
<p>Speaker Recognition</p>
<p>By Judith A. Markowitz, J. Markowitz Consultants</p>
<p>Speaker recognition uses features of a person’s voice to identify or verify that person. It is a well-established biometric, with commercial systems that are more than 10 years old and deployed non-commercial systems that are more than 20 years old. This paper describes how speaker recognition systems work and how they are used in applications.</p>
<p>1. Introduction</p>
<p>Speaker recognition (also called voice ID and voice biometrics) is the only human-biometric technology in commercial use today that extracts information from sound patterns. It is also one of the most well-established biometrics, with deployed commercial applications that are more than 10 years old and non-commercial systems that are more than 20 years old.</p>
<p>2. How do Speaker-Recognition Systems Work?</p>
<p>Speaker-recognition systems use features of a person’s voice and speaking style to:</p>
<p>attach an identity to the voice of an unknown speaker</p>
<p>verify that a person is who she/he claims to be</p>
<p>separate one person’s voice from other voices in a multi-speaker environment</p>
<p>The first operation is called speaker identification or speaker recognition; the second has many names, including speaker verification, speaker authentication, voice verification, and voice recognition; the third is speaker separation or, in some situations, speaker classification. This paper focuses on speaker verification, the most highly commercialized of these technologies.</p>
<p>2.1 Overview of the Process</p>
<p>Speaker verification is a biometric technology used for determining whether a person is who she or he claims to be. It should not be confused with speech recognition, a non-biometric technology used for identifying what a person is saying. Speech recognition products are not designed to determine who is speaking.</p>
<p>Speaker verification begins with a claim of identity (see Figure A1). Usually, the claim entails manual entry of a personal identification number (PIN), but a growing number of products allow spoken entry of the PIN and use speech recognition to identify the numeric code. Some applications replace manual or spoken PIN entry with bank cards, smartcards, or the number of the telephone being used. PINs are also eliminated when a speaker-verification system contacts the user, an approach typical of</p>
<p>Figure A1.</p>
<p>Once the identity claim has been made, the system retrieves the stored voice sample (called a voiceprint) for the claimed identity and requests spoken input from the person making the claim. Usually, the requested input is a password. The newly input speech is compared with the stored voiceprint, and the result of that comparison is measured against an acceptance/rejection threshold. Finally, the system accepts the speaker as the authorized user, rejects the speaker as an impostor, or takes ano</p>
<p>If the verification is successful, the system may update the acoustic information in the stored voiceprint. This process is called adaptation. Adaptation is an unobtrusive way of keeping voiceprints current and is used by many commercial speaker-verification systems.</p>
<p>2.2 The Speech Sample</p>
<p>As with all biometrics, before verification (or identification) can be performed the person must provide a sample of speech (a step called enrolment). The sample is used to create the stored voiceprint.</p>
<p>Systems differ in the type and amount of speech needed for enrolment and verification. The basic divisions among these systems are</p>
<p>text dependent</p>
<p>text independent</p>
<p>text prompted</p>
<p>2.2.1 Text Dependent</p>
<p>Most commercial systems are text dependent. Text-dependent systems expect the speaker to say a pre-determined phrase, password, or ID. By controlling the words that are spoken, the system can look for a close match with the stored voiceprint. Typically, each person selects a private password, although some administrators prefer to assign passwords. Passwords offer extra security, requiring an impostor to know the correct PIN and password and to have a matching voice. Some systems further enhance</p>
<p>A global phrase may also be used. In its 1996 pilot of speaker verification, Chase Manhattan Bank used ‘Verification by Chemical Bank’. Global phrases avoid the problem of forgotten passwords, but lack the added protection offered by private passwords.</p>
<p>2.2.2 Text Independent</p>
<p>Text-independent systems ask the person to talk. What the person says is different every time. It is extremely difficult to accurately compare utterances that are totally different from each other, particularly in noisy environments or over poor telephone connections. Consequently, commercial deployment of text-independent verification has been limited.</p>
<p>2.2.3 Text Prompted</p>
<p>Text-prompted systems (also called challenge response) ask speakers to repeat one or more randomly selected numbers or words (e.g. “43516”, “27, 46”, or “Friday, computer”). Text prompting adds time to enrolment and verification, but it enhances security against tape recordings. Since the items to be repeated cannot be predicted, it is extremely difficult to play a recording. Furthermore, there is no problem of forgetting a password, even though the PIN, if used, may still be forgotten.</p>
<p>2.3 Anti-speaker Modelling</p>
<p>Most systems compare the new speech sample with the stored voiceprint for the claimed identity. Other systems also compare the newly input speech with the voices of other people. Such techniques are called anti-speaker modelling. The underlying philosophy of anti-speaker modelling is that, under any conditions, a voice sample from a particular speaker will be more like other samples from that person than like voice samples from other speakers. If, for example, the speaker is using a bad telephone conne</p>
<p>The most common anti-speaker techniques are</p>
<p>discriminate training</p>
<p>cohort modelling</p>
<p>world models</p>
<p>Discriminate training builds the comparisons into the voiceprint of the new speaker using the voices of the other speakers in the system. Cohort modelling selects a small set of speakers whose voices are similar to that of the person being enrolled. Cohorts are, for example, always the same sex as the speaker. When the speaker attempts verification, the incoming speech is compared with his/her stored voiceprint and with the voiceprints of each of the cohort speakers. World models (also called ba</p>
<p>2.4 Physical and Behavioural Biometrics</p>
<p>Speaker recognition is often characterized as a behavioural biometric. This description is set in contrast with physical biometrics, such as fingerprinting and iris scanning. Unfortunately, its classification as a behavioural biometric promotes the misunderstanding that speaker recognition is entirely (or almost entirely) behavioural. If that were the case, good mimics would have no difficulty defeating speaker-recognition systems. Early studies determined this was not the case and identified mi</p>
<p>The physical/behavioural classification also implies that the performance of physical biometrics is not heavily influenced by behaviour. This misconception has led to the design of biometric systems that are unnecessarily vulnerable to careless and resistant users. This is unfortunate because it has delayed good human-factors design for those biometrics.</p>
<p>3. How is Speaker Verification Used?</p>
<p>Speaker verification is well established as a means of providing biometric-based security for:</p>
<p>telephone networks</p>
<p>site access</p>
<p>data and data networks</p>
<p>and monitoring of:</p>
<p>criminal offenders in community release programmes</p>
<p>outbound calls by incarcerated felons</p>
<p>time and attendance</p>
<p>3.1 Telephone Networks</p>
<p>Toll fraud (theft of long-distance telephone services) is a growing problem that costs telecommunications services providers, government, and private industry US$3-5 billion annually in the United States alone. The major types of toll fraud include the following:</p>
<p>Hacking CPE</p>
<p>Calling card fraud</p>
<p>Call forwarding</p>
<p>Prisoner toll fraud</p>
<p>Hacking 800 numbers</p>
<p>Call sell operations</p>
<p>900 number fraud</p>
<p>Switch/network hits</p>
<p>Social engineering</p>
<p>Subscriber fraud</p>
<p>Cloning wireless telephones</p>
<p>Among the most damaging are theft of services from customer premises equipment (CPE), such as PBXs, and cloning of wireless telephones. Cloning involves stealing the ID of a telephone and programming other phones with it. Subscriber fraud, a growing problem in Europe, involves enrolling for services, usually under an alias, with no intention of paying for them.</p>
<p>Speaker verification has two features that make it ideal for telephone and telephone network security: it uses voice input and it is not bound to proprietary hardware. Unlike most other biometrics, which need specialized input devices, speaker verification operates with standard wireline and/or wireless telephones over existing telephone networks. Reliance on input devices created by other manufacturers for a purpose other than speaker verification also means that speaker verification cannot expec</p>
<p>Applications of speaker verification on wireline networks include secure calling cards, interactive voice response (IVR) systems, and integration with security for proprietary network systems. Such applications have been deployed by organizations as diverse as the University of Maryland, the Department of Foreign Affairs and International Trade Canada, and AMOCO. Wireless applications focus on preventing cloning but are being extended to subscriber fraud. The European Union is also actively appl</p>
<p>3.2 Site Access</p>
<p>The first deployment of speaker verification, more than 20 years ago, was for site access control. Since then, speaker verification has been used to control access to office buildings, factories, laboratories, bank vaults, homes, pharmacy departments in hospitals, and even access to the US and Canada. Since April 1997, the US Immigration and Naturalization Service (INS) and other US and Canadian agencies have been using speaker verification to control after-hours border crossings at the Scob</p>
<p>3.3 Data and Data Networks</p>
<p>Growing threats of unauthorized penetration of computing networks, concerns about security of the Internet, and increases in off-site employees with data access needs have produced an upsurge in the application of speaker verification to data and network security.</p>
<p>The financial services industry has been a leader in using speaker verification to protect proprietary data networks, electronic funds transfer between banks, access to customer accounts for telephone banking, and employee access to sensitive financial information. The Illinois Department of Revenue, for example, uses speaker verification to allow secure access to tax data by its off-site auditors.</p>
<p>3.4 Corrections</p>
<p>In 1993, there were 4.8 million adults under correctional supervision in the United States, and that number continues to increase. Community release programmes, such as parole and home detention, are the fastest growing segments of this industry. It is no longer possible for corrections officers to provide adequate monitoring of those people.</p>
<p>In the US, corrections agencies have turned to electronic monitoring systems. Since the late 1980s, speaker verification has been one of those electronic monitoring tools. Today, several products are used by corrections agencies, including an alcohol breathalyzer with speaker verification for people convicted of driving while intoxicated and a system that calls offenders on home detention at random times during the day.</p>
<p>Speaker verification also controls telephone calls made by incarcerated felons. Inmates place a lot of calls. In 1994, US telecommunications services providers made US$1.5 billion on outbound calls from inmates. Most inmates have restrictions on whom they can call. Speaker verification ensures that an inmate is not using another inmate’s PIN to make a forbidden contact.</p>
<p>3.5 Time and Attendance</p>
<p>Time and attendance applications are a small but growing segment of the speaker-verification market. SOC Credit Union in Michigan has used speaker verification for time and attendance monitoring of part-time employees for several years. Like many others, SOC Credit Union first deployed speaker verification for security and later extended it to time and attendance monitoring for part-time employees.</p>
<p>4. Standards</p>
<p>This paper concludes with a short discussion of application programming interface (API) standards. An API contains the function calls that enable programmers to use speaker verification to create a product or application. Until April 1997, when the Speaker Verification API (SVAPI) standard was introduced, all available APIs for biometric products were proprietary. SVAPI remains the only API standard covering a specific biometric. It is now being incorporated into proposed generic biometric API s</p>
<p>Why is it important to support API standards? Developers using a product with a proprietary API face difficult choices if the vendor of that product goes out of business, fails to support its product, or does not keep pace with technological advances. One of those choices is to rebuild the application from scratch using a different product. Given the same events, developers using an SVAPI-compliant product can select another compliant vendor and need perform far fewer modifications. Consequently,</p>
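<p>The verification flow of Section 2.1 (compare new speech with the stored voiceprint, then test the result against an acceptance/rejection threshold), the cohort comparison of Section 2.3, and the adaptation step can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and does not come from the paper or from any product: voiceprints are treated as short fixed-length feature vectors, the comparison is a cosine similarity, the anti-speaker score is the claimed speaker's score minus the best cohort score, and the names <code>similarity</code>, <code>verify</code>, and <code>adapt</code>, the threshold 0.1, and the blend rate 0.25 are hypothetical.</p>

```python
import math
from typing import List, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(sample: Sequence[float],
           claimed_voiceprint: Sequence[float],
           cohort_voiceprints: Sequence[Sequence[float]],
           threshold: float = 0.1) -> bool:
    """Accept the identity claim only if the sample matches the claimed
    voiceprint better than the closest cohort voiceprint by `threshold`."""
    if not cohort_voiceprints:
        raise ValueError("at least one cohort voiceprint is required")
    claimed_score = similarity(sample, claimed_voiceprint)
    best_cohort_score = max(similarity(sample, v) for v in cohort_voiceprints)
    # Anti-speaker normalisation: a poor channel lowers both scores, so the
    # difference is less sensitive to conditions than the raw score alone.
    return claimed_score - best_cohort_score >= threshold


def adapt(voiceprint: Sequence[float],
          sample: Sequence[float],
          rate: float = 0.25) -> List[float]:
    """After a successful verification, move the stored voiceprint a little
    towards the new sample (hypothetical linear blend)."""
    return [(1 - rate) * v + rate * s for v, s in zip(voiceprint, sample)]


if __name__ == "__main__":
    enrolled = [0.9, 0.1]
    cohort = [[0.0, 1.0], [0.5, 0.5]]
    sample = [1.0, 0.0]
    if verify(sample, enrolled, cohort):
        enrolled = adapt(enrolled, sample)
    print(enrolled)
```

<p>A deployed system would work with much richer acoustic models and would tune the threshold to balance false acceptances against false rejections. The sketch only shows where the threshold, the cohort scores, and the adaptation step sit relative to one another.</p>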
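<p><b>Appendix B: Chinese Translation</b></p>
<p><b>Speaker Recognition</b></p>
<p>By Judith A. Markowitz, J. Markowitz Consultants</p>
<p>Speaker recognition uses features of a person’s voice to identify or verify that person. With commercial systems that are more than 10 years old and deployed non-commercial systems that are more than 20 years old, it is a well-established biometric. This paper describes how speaker recognition systems work and how they are used in applications.</p>
<p><b>1. Introduction</b></p>
<p>Speaker recognition (also called voice ID and voice biometrics) is the only human-biometric technology in commercial use today that extracts information from sound patterns. With deployed commercial applications that are more than 10 years old and non-commercial systems that are more than 20 years old, it is also one of the most well-established biometrics.</p>
<p>2. How do Speaker-Recognition Systems Work?</p>
<p>Speaker-recognition systems use a person’s voice and speaking style to:</p>
<p>attach an identity to the voice of an unknown speaker</p>
<p>verify that a person is who he/she claims to be</p>
<p>separate each particular person’s voice from other voices in a multi-speaker environment</p>
<p>The first operation is called speaker identification or speaker recognition; the second has many names, including speaker verification, speaker authentication, voice verification, and voice recognition; the third is speaker separation, also called speaker classification in some situations. This paper focuses on speaker verification, the most highly commercialized of these technologies.</p>
<p><b>2.1 Overview of the Process</b></p>
<p>Speaker verification is a biometric technology for determining whether a person is who he or she claims to be. It should not be confused with speech recognition, a non-biometric technology used for determining what a person is saying. Speech recognition products are not designed to determine who is speaking.</p>
<p>Speaker verification begins with a claim of identity (see Figure B1). Usually, the claim requires manual entry of a personal identification number (PIN), but a growing number of products allow spoken entry of the PIN and use speech recognition to identify the numeric code. Some applications replace manual or spoken PIN entry with bank cards, smartcards, or the number of the telephone being used. PINs are also eliminated when a speaker-verification system contacts the user, an approach typical of systems used to monitor offenders serving their sentences at home.</p>
<p><b>Figure B1</b></p>
<p>Once the identity claim has been made, the system retrieves the stored voice sample (called a voiceprint) for the claimed identity and requests spoken input from the person making the claim. Usually, the requested input is a password. The newly input speech is compared with the stored voiceprint, and the result of the comparison is measured against an acceptance/rejection threshold. Finally, the system accepts the speaker as the authorized user, rejects the speaker as an impostor, or takes another action defined by the application. Some systems report a confidence level or other score indicating how confident they are in the decision.</p>
<p>If the verification is successful, the system may update the acoustic information in the stored voiceprint. This process is called adaptation. Adaptation is an unobtrusive way of keeping voiceprints current. It is used by many commercial speaker-verification systems.</p>
<p><b>2.2 The Speech Sample</b></p>
<p>As with all biometrics, before verification (or identification) can be performed a sample of speech must be provided (a step called enrolment). The sample is used to create the stored voiceprint.</p>
<p>Systems differ in the type and amount of speech needed for enrolment and verification. The basic divisions among these systems are:</p>
<p><b>text dependent</b></p>
<p><b>text independent</b></p>
<p><b>text prompted</b></p>
<p>2.2.1 Text Dependent</p>
<p>Most commercial systems are text dependent. Text-dependent systems expect the user to say a pre-determined phrase, password, or ID. By controlling the words that are spoken, the system can look for a close match with the stored voiceprint. Typically, each user selects a private password, although some administrators prefer to assign passwords. Passwords offer extra security, because an impostor must know the correct PIN and password and also have a matching voice. Some systems further enhance security by not storing a human-readable representation of the password.</p>
<p>A global phrase may also be used. In its 1996 pilot of speaker verification, Chase Manhattan Bank used ‘Verification by Chemical Bank’. Global phrases avoid the problem of forgotten passwords, but lack the added protection offered by private passwords.</p>
<p>2.2.2 Text Independent</p>
<p>Text-independent systems ask the user to talk. What the user says is different every time. It is extremely difficult to accurately match utterances that are totally different from each other, particularly in noisy environments or over very poor telephone connections. Consequently, commercial deployment of text-independent verification has been limited.</p>
<p>2.2.3 Text Prompted</p>
<p>Text-prompted systems (also called challenge response) ask speakers to repeat one or more randomly selected numbers or words (e.g. “43516”, “27, 46”, or “Friday, computer”). Text prompting adds time to enrolment and verification, but it enhances security against tape recordings. Since the items to be repeated cannot be predicted, it is extremely difficult to play a recording. Furthermore, there is no problem of forgetting a password, although the PIN, if used, may still be forgotten.</p>
<p>2.3 Anti-speaker Modelling</p>
<p>Most systems compare the new speech sample with the stored voiceprint for the claimed identity. Other systems also compare the newly input speech with the voices of other people. Such techniques are called anti-speaker modelling. The underlying principle of anti-speaker modelling is that, under any conditions, a voice sample from a particular speaker will be more like other samples from that speaker than like voice samples from other speakers. If, for example, the speaker is using a bad telephone connection and the match with the speaker’s voiceprint is poor, it is likely that the scores for the cohorts (or world model) will be even poorer.</p>
<p>The most common anti-speaker techniques are:</p>
<p><b>discriminate training</b></p>
<p><b>cohort modelling</b></p>
<p><b>world models</b></p>
<p>Discriminate training builds the comparisons into the voiceprint of the new speaker using the voices of the other speakers in the system. Cohort modelling selects a small set of speakers whose voices are similar to that of the person being enrolled. Cohorts are, for example, always the same sex as the speaker. When the speaker attempts verification, the incoming speech is compared with his/her voiceprint and with the voiceprints of each of the cohort speakers. World models (also called background models or composite models) contain a cross-section of voices. The same world model is used for all speakers.</p>
<p>2.4 Physical and Behavioural Biometrics</p>
<p>Speaker recognition is often characterized as a behavioural biometric. This description is set in contrast with physical biometrics, such as fingerprinting and iris scanning. Unfortunately, its classification as a behavioural biometric promotes the misunderstanding that speaker recognition is entirely (or almost entirely) behavioural. If that were the case, good mimics would have no difficulty defeating speaker-recognition systems. Early studies determined this was not the case and identified mimic-resistant factors. Those factors reflect the size and shape of a speaker’s speaking mechanism (called the vocal tract).</p>
<p>The physical/behavioural classification also implies that the performance of physical biometrics is not heavily influenced by behaviour. This misconception has led to the design of biometric systems that are unnecessarily vulnerable to careless and resistant users. This is unfortunate because it has delayed good human-factors design for those biometrics.</p>
<p>3. How is Speaker Verification Used?</p>
<p>Speaker verification is a well-established means of providing biometric-based security. It is commonly used for:</p>
<p><b>telephone networks</b></p>
<p><b>site access</b></p>
<p><b>data and data networks</b></p>
<p>It is also used for monitoring of:</p>
<p><b>criminal offenders in community release programmes</b></p>
<p><b>outbound calls by incarcerated felons</b></p>
<p><b>time and attendance</b></p>
<p><b>3.1 Telephone Networks</b></p>
<p>Toll fraud (theft of long-distance telephone services) is a growing problem that costs telecommunications services providers, government, and private industry US$3-5 billion annually in the United States alone. The major types of toll fraud include the following:</p>
<p><b>Hacking CPE</b></p>
<p><b>Calling card fraud</b></p>
<p><b>Call forwarding</b></p>
<p><b>Prisoner toll fraud</b></p>
<p><b>Hacking 800 numbers</b></p>
<p><b>Call sell operations</b></p>
<p><b>900 number fraud</b></p>
<p><b>Switch/network hits</b></p>
<p><b>Social engineering</b></p>
<p><b>Subscriber fraud</b></p>
<p><b>Cloning wireless telephones</b></p>
<p>Among the most damaging are theft of services from customer premises equipment (CPE), such as PBXs, and cloning of wireless telephones. Cloning involves stealing the ID of a telephone and programming other phones with it. Subscriber fraud is a growing problem in Europe. It involves enrolling for services, usually under an alias, with no intention of paying for them.</p>
<p>Speaker verification has two features that make it ideal for telephone and telephone network security: it uses voice input and it is not bound to proprietary hardware. Unlike other biometrics that need specialized input devices, speaker verification operates with wireline and/or wireless telephones over existing telephone networks. Reliance on input devices created by other manufacturers for a purpose other than speaker verification also means that speaker verification cannot expect the consistency and quality offered by a proprietary input device. Speaker verification must overcome differences in input devices and in the way audio frequencies are processed. Variability is produced by differences in network type (for example, wireline versus wireless), unpredictable noise levels on the line and in the environment, transmission inconsistency, and differences in the microphones in telephone handsets. Sensitivity to such variability is reduced through techniques such as speech enhancement and noise modelling, but products still need to be tested under the expected conditions of use.</p>
<p>Applications of speaker verification on wireline networks include secure calling cards, interactive voice response systems, and integration with security for proprietary network systems. Such applications have been deployed by organizations as diverse as the University of Maryland, the Department of Foreign Affairs and International Trade Canada, and AMOCO. Wireless applications focus on preventing cloning but are being extended to subscriber fraud. The European Union is also actively applying speaker verification to telephony in various projects, including caller verification in banking and telecommunications, COST250, and Picasso.</p>
<p><b>3.2 Site Access</b></p>
<p>The first deployment of speaker verification, more than 20 years ago, was for site access control. Since then, speaker verification has been used to control access to office buildings, factories, laboratories, bank vaults, homes, pharmacy departments in hospitals, and even access to the US and Canada. Since 1997, the US Immigration and Naturalization Service (INS) and other US and Canadian agencies have been using speaker verification to control after-hours border crossings at the Scobey, Montana port of entry. The INS is also testing a combined speaker-verification and face-recognition system in commuter lanes at other ports of entry.</p>
<p>3.3 Data and Data Networks</p>
<p>Growing threats of unauthorized penetration of computing networks, concerns about security of the Internet, and increases in off-site employees with data access needs have produced an upsurge in the application of speaker verification to data and network security.</p>
<p>The financial services industry has been a leader in using speaker verification to protect proprietary data networks, electronic funds transfer between banks, access to customer accounts for telephone banking, and employee access to sensitive financial information. The Illinois Department of Revenue, for example, uses speaker verification to allow secure access to tax data by its off-site auditors.</p>
<p><b>3.4 Corrections</b></p>
<p>In 1993, there were 4.8 million adults under correctional supervision in the United States, and that number continues to increase. Community release programmes, such as parole and home detention, are the fastest growing segments of this industry. It is no longer possible for corrections officers to provide adequate monitoring of those people.</p>
<p>In the US, corrections agencies have turned to electronic monitoring systems. Since the late 1980s, speaker verification has been one of those electronic monitoring tools. Today, several products are used by corrections agencies, including an alcohol breathalyzer with speaker verification for people convicted of driving while intoxicated and a system that calls offenders on home detention at random times during the day.</p>
<p>Speaker verification also controls telephone calls made by incarcerated felons. Inmates place a lot of calls. In 1994, US telecommunications services providers made US$1.5 billion on outbound calls from inmates. Most inmates have restrictions on whom they can call. Speaker verification ensures that an inmate is not using another inmate’s PIN to make a forbidden contact.</p>
<p><b>3.5 Time and Attendance</b></p>
<p>Time and attendance applications are a small but growing segment of the speaker-verification market. SOC Credit Union in Michigan has used speaker verification for time and attendance monitoring of part-time employees for several years. Like other organizations, SOC Credit Union first deployed speaker verification for security and later extended it to time and attendance monitoring for part-time employees.</p>
<p><b>4. Standards</b></p>
<p>This paper concludes with a short discussion of application programming interface (API) standards. An API contains the function calls that enable programmers to use speaker verification to create a product or application. Until April 1997, when the Speaker Verification API (SVAPI) standard was introduced, all available APIs for biometric products were proprietary. SVAPI remains the only API standard covering a specific biometric. It is now being incorporated into proposed generic biometric API standards. SVAPI was developed by a cross-section of speaker-recognition vendors, consultants, and end-user organizations to address a spectrum of needs and to support a broad range of product features. Because it supports both high-level functions (such as calls to enrol) and low-level functions (such as choices of audio input features), it helps both novice and experienced developers build different types of applications.</p>
<p>Why is it important to support API standards? Developers using a product with a proprietary API face difficult choices if the vendor of that product goes out of business, fails to support its product, or does not keep pace with technological advances. One of those choices is to rebuild the application from scratch using a different product. Given the same events, developers using an SVAPI-compliant product can select another compliant vendor and need perform far fewer modifications. Consequently, SVAPI makes development with speaker verification less risky and less costly. The advent of generic biometric API standards further facilitates integration of speaker verification with other biometrics. All of this helps speaker-verification vendors because it fosters growth in the marketplace. In the final analysis, active support of API standards by developers and vendors benefits everyone.</p>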