-
A Model for an Interconnected Photovoltaic System using an Off-grid Inverter as a Reference Node in Island Mode
Abstract: This paper presents a new feature extraction algorithm called power-normalized cepstral coefficients (PNCC) that is motivated by auditory processing. Major new features of PNCC processing include the use of a power-law nonlinearity that replaces the traditional log nonlinearity used in MFCC coefficients, a noise-suppression algorithm based on asymmetric filtering that suppresses background excitation, and a module that accomplishes temporal masking. We also propose the use of medium-time power analysis, in which environmental parameters are estimated over a longer duration than is commonly used for speech, as well as frequency smoothing. Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for speech in the presence of various types of additive noise and in reverberant environments, with only slightly greater computational cost than conventional MFCC processing, and without degrading the recognition accuracy that is observed while training and testing using clean speech. PNCC processing also provides better recognition accuracy in noisy environments than techniques such as vector Taylor series (VTS) and the ETSI advanced front end (AFE) while requiring much less computation. We describe an implementation of PNCC using “online processing” that does not require future knowledge of the input.
Keywords: modulation filtering,feature extraction,rate-level curve,on-line speech processing,Robust speech recognition,temporal masking,asymmetric filtering,spectral weight smoothing,power function,physiological modeling,medium-time power estimation
Updated 2025-09-19 17:13:59
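The abstract above contrasts a power-law nonlinearity with the log nonlinearity used in MFCC. The following is a minimal sketch of that one step only, not the full PNCC pipeline. The function names, the flooring constant, and the exponent of 1/15 are assumptions for illustration; the abstract itself does not state an exponent.

```python
import numpy as np

def log_compress(power, floor=1e-10):
    """MFCC-style compression: log of the (floored) channel power."""
    return np.log(np.maximum(power, floor))

def power_law_compress(power, exponent=1.0 / 15.0):
    """Power-law compression; the exponent value is an assumption here."""
    return np.power(np.maximum(power, 0.0), exponent)

# Near-silent channels: the log output is strongly negative and depends on
# the chosen floor, while the power-law output simply approaches zero.
powers = np.array([0.0, 1e-6, 1.0, 100.0])
log_out = log_compress(powers)
pow_out = power_law_compress(powers)
```

Under these assumptions, low-power inputs map to values close to zero after power-law compression instead of large negative values, which is one intuition for why small amounts of additive noise disturb the features less.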
-
Non-covalent Methods of Engineering Optical Sensors Based on Single-Walled Carbon Nanotubes
Abstract: Optical sensors based on single-walled carbon nanotubes (SWCNTs) demonstrate tradeoffs that limit their use in in vivo and in vitro environments. Sensor characteristics are primarily governed by the non-covalent wrapping used to suspend the hydrophobic SWCNTs in aqueous solutions, and we herein review the advantages and disadvantages of several of these different wrappings. Sensors based on surfactant wrappings can show enhanced quantum efficiency, high stability, scalability, and diminished selectivity. Conversely, sensors based on synthetic and bio-polymer wrappings tend to show lower quantum efficiency, stability, and scalability, while demonstrating improved selectivity. Major efforts have focused on optimizing sensors based on DNA wrappings, which have intermediate properties that can be improved through synthetic modifications. Although SWCNT sensors have, to date, been mainly engineered using empirical approaches, herein we highlight alternative techniques based on iterative screening that offer a more guided approach to tuning sensor properties. These more rational techniques can yield new combinations that incorporate the advantages of the diverse nanotube wrappings available to create high performance optical sensors.
Keywords: optical biosensing,non-covalent solubilization,selectivity,molecular recognition,near-infrared sensors,single-walled carbon nanotubes (SWCNTs or SWNTs),fluorescence brightness
Updated 2025-09-19 17:13:59
-
[IEEE 2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL) - Sozopol, Bulgaria (2019.9.6-2019.9.8)] 2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL) - Theoretical and Experimental Investigation of the Optical Properties of Nano nickel Films
Abstract: Features for speech emotion recognition are usually dominated by spectral magnitude information, while the phase spectrum is ignored because it is difficult to interpret properly. Motivated by recent successes of phase-based features for speech processing, this paper investigates the effectiveness of phase information for whispered speech emotion recognition. We select two types of phase-based features (i.e., modified group delay features and all-pole group delay features), both of which have shown wide applicability to a range of speech analysis tasks and are now studied in whispered speech emotion recognition. When exploiting these features, we propose a new speech emotion recognition framework, employing the outer product in combination with power and L2 normalization. The corresponding technique encodes any variable-length sequence of the phase-based features into a fixed-dimension vector regardless of the length of the input sequence. The resulting representation is used to train a classification model with a linear kernel classifier. Experimental results on the Geneva Whispered Emotion Corpus database, including normal and whispered phonation, demonstrate the effectiveness of the proposed method when compared with other modern systems. It is also shown that combining phase information with magnitude information can significantly improve performance over common systems that adopt magnitude information alone.
Keywords: whispered speech emotion recognition,Phase-based features,outer product
Updated 2025-09-19 17:13:59
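The encoding step described in the abstract above (outer product, then power and L2 normalization, giving a fixed-dimension vector for any sequence length) can be sketched as follows. This is an illustrative reading, not the authors' implementation: average pooling of per-frame outer products, the signed power normalization with `alpha=0.5`, and the name `encode_sequence` are all assumptions.

```python
import numpy as np

def encode_sequence(frames, alpha=0.5):
    """Encode a (T, D) sequence of frame features as a D*D vector.

    Assumed steps: average the per-frame outer products, apply a signed
    power normalization with exponent alpha, then L2-normalize.
    """
    frames = np.asarray(frames, dtype=float)
    # Sum of outer products x_t x_t^T over all frames, divided by T.
    pooled = frames.T @ frames / frames.shape[0]
    vec = pooled.reshape(-1)
    # Signed power normalization keeps the sign and compresses magnitude.
    vec = np.sign(vec) * np.abs(vec) ** alpha
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

rng = np.random.default_rng(0)
short_utt = encode_sequence(rng.normal(size=(50, 4)))
long_utt = encode_sequence(rng.normal(size=(400, 4)))
```

Both `short_utt` and `long_utt` have 16 entries even though the inputs have 50 and 400 frames, which is what allows a linear classifier to be trained on utterances of different lengths.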
-
[IEEE 2019 IEEE Conference on Electrical Insulation and Dielectric Phenomena (CEIDP) - Richland, WA, USA (2019.10.20-2019.10.23)] 2019 IEEE Conference on Electrical Insulation and Dielectric Phenomena (CEIDP) - The Anti-Interference Method of Michelson Optical Fiber Interferometer for GIS Partial Discharge Ultrasonic Detection
Abstract: The performance of automatic speech recognition (ASR) systems can be significantly improved by integrating further sources of information such as additional modalities, acoustic channels, or acoustic models. The resulting information fusion problem exhibits striking parallels to problems in digital communications, where the discovery of turbo codes by Berrou et al. was a groundbreaking innovation. In this paper, we show how to apply the turbo principle successfully to the domain of ASR and thereby provide solutions to the above-mentioned information fusion problem. The contribution of our work is fourfold: First, we review the turbo decoding forward-backward algorithm (FBA), giving detailed insights into turbo ASR, and providing a new interpretation and formulation of the so-called extrinsic information being passed between the recognizers. Second, we present a real-time capable turbo-decoding Viterbi algorithm suitable for practical information fusion and recognition tasks. Third, we present simulation results for a multimodal example of information fusion. Finally, we prove the suitability of both our turbo FBA and turbo Viterbi algorithm for a single-channel multimodel recognition task obtained by using two acoustic feature extraction methods. On a small vocabulary task (challenging, since spelling is included), our proposed turbo ASR approach outperforms even the best reference system on average over all SNR conditions and investigated noise types by a relative word error rate (WER) reduction of 22.4% (audio-visual task) and 18.2% (audio-only task), respectively.
Keywords: hidden Markov models,Speech recognition,multimedia systems,robustness,iterative decoding
Updated 2025-09-19 17:13:59
-
[IEEE 2020 National Conference on Communications (NCC) - Kharagpur, India (2020.2.21-2020.2.23)] 2020 National Conference on Communications (NCC) - Simultaneous Measurement of Atmospheric Turbulence Induced Intensity and Polarization Fluctuation for Free Space Optical Communication
Abstract: For the problem of action detection, most existing methods require that relevant portions of the action of interest in training videos have been manually annotated with bounding boxes. Some recent works tried to avoid tedious manual annotation and proposed to identify the relevant portions in training videos automatically. However, these methods address identification in only the spatial or the temporal domain, and may pick up irrelevant content from the other domain. Such irrelevant content is usually undesirable in the training phase and degrades detection performance. This paper advances prior work by proposing a joint learning framework to simultaneously identify the spatial and temporal extents of the action of interest in training videos. To get pixel-level localization results, our method uses dense trajectories extracted from videos as local features to represent actions. We first present a trajectory split-and-merge algorithm to segment a video into the background and several separated foreground moving objects. In this algorithm, the inherent temporal smoothness of human actions is exploited to facilitate segmentation. Then, with the latent SVM framework on segmentation results, spatial and temporal extents of the action of interest are treated as latent variables that are inferred simultaneously with action recognition. Experiments on two challenging datasets show that action detection with our learned spatial and temporal extents is superior to state-of-the-art methods.
Keywords: discriminative latent variable model,split-and-merge,action recognition,Action localization
Updated 2025-09-19 17:13:59
-
[IEEE 2019 IEEE 46th Photovoltaic Specialists Conference (PVSC) - Chicago, IL, USA (2019.6.16-2019.6.21)] 2019 IEEE 46th Photovoltaic Specialists Conference (PVSC) - Designing of CZTSSe Based SnS Thin Film Solar Cell for Improved Conversion Efficiency: A Simulation Study with SCAPS
Abstract: This paper proposes a new computational method for retrieving shapes under unpredictable conditions such as when occlusion, geometric distortion, and differences in image resolution occur simultaneously. The human visual system retrieves shapes from incomplete information in the real world, and it has inspired many computational methods of retrieving shapes. To retrieve shapes, observed shapes are compared with remembered shapes and judged to be alike or unlike. To compare the observed and remembered shapes, they must first be appropriately represented so that the points on each shape can be mapped and compared. For this reason, the shape retrieval process needs appropriate shape representation and shape mapping methods. Moreover, the shape representations should be normalized before the mapping process. However, a normalization process for representations under unpredictable conditions has not yet been established. In this paper, we describe a shape retrieval method that enables us to retrieve shapes under unpredictable conditions with a suitable normalization process. Using curvature partition and an angle-length profile, our shape retrieval method normalizes the shape representation before it does the mapping. As a result, unlike previously proposed methods, it can be used under unpredictable conditions such as when occlusion, geometric distortion, and differences in image resolution occur simultaneously.
Keywords: curvature partition,Shape recognition,geometric parameter,occlusion,shape retrieval
Updated 2025-09-19 17:13:59
-
Separation Between Coal and Gangue Based on Infrared Radiation and Visual Extraction of the YCbCr Color Space
Abstract: Distinguishing between coal and gangue in the production lines of mining factories based on the thermal energy and infrared radiation emission of an object is feasible. In this paper, we use an infrared camera (IC) to distinguish between coal and gangue in the industrial mining field. This system is treated as a binary classification system with two classes. We analyze the infrared images of coal and gangue, then extract the appropriate texture features from the infrared images in order to develop an accurate classification system using a support vector machine (SVM). The method applied in this work essentially depends on feature extraction from images. Statistical features based on gray level information (GLI), the gray-level co-occurrence matrix (GLCM), and visual features are computed. We then suggest preparation steps to obtain one selected feature before importing the data into the SVM classifier, and this approach is adopted as the fundamental basis for our work. We exploit only one feature of the infrared image, namely Cb, which is extracted from the YCbCr color space, and then compute the mean value of Cb after heating the coal and gangue samples and capturing their photos. The proposed method achieves a high classification accuracy of 97.83% using a Gaussian SVM.
Keywords: YCbCr,SVM,emissive power,gangue recognition,infrared camera application,Industrial mining
Updated 2025-09-19 17:13:59
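The single feature named in the abstract above, the mean Cb value of an image, can be sketched as follows. The abstract does not say which RGB-to-YCbCr conversion was used; this sketch assumes the full-range JFIF (BT.601) coefficients, and `mean_cb` is a hypothetical helper name. The SVM stage is not shown.

```python
import numpy as np

def mean_cb(rgb):
    """Mean Cb of an (H, W, 3) RGB image with values in 0..255.

    Assumes the full-range JFIF conversion:
    Cb = 128 - 0.168736 R - 0.331264 G + 0.5 B
    """
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return float(cb.mean())

# Synthetic stand-ins for captured images (not real coal or gangue data).
gray_patch = np.full((8, 8, 3), 100.0)
bluish_patch = np.zeros((8, 8, 3))
bluish_patch[..., 2] = 200.0

feature_gray = mean_cb(gray_patch)
feature_blue = mean_cb(bluish_patch)
```

Each image reduces to one scalar, so a classifier such as the Gaussian SVM mentioned in the abstract would be trained on a one-dimensional feature per sample.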
-
[IEEE 2018 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) - Ishigaki, Okinawa, Japan (2018.11.27-2018.11.30)] 2018 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) - A Single LED Multi-reception Indoor Visible Light Position System Based on Power Estimation-Angle Algorithm
Abstract: Care issues and costs associated with an increasing elderly population are becoming a major concern for many countries. The use of assistive robots in “smart-home” environments has been suggested as a possible partial solution to these concerns. A challenge is the personalization of the robot to meet the changing needs of the elderly person over time. One approach is to allow the elderly person, or their carers or relatives, to make the robot learn activities in the smart home and teach it to carry out behaviors in response to these activities. The overriding premise is that such teaching is both intuitive and “nontechnical.” To evaluate these issues, a commercially available autonomous robot has been deployed in a fully sensorized but otherwise ordinary suburban house. We describe the design approach to the teaching, learning, robot, and smart home systems as an integrated unit and present results from an evaluation of the teaching component with 20 participants and a preliminary evaluation of the learning component with three participants in a human–robot interaction experiment. Participants reported findings using a system usability scale and ad hoc Likert questionnaires. Results indicated that participants thought that this approach to robot personalization was easy to use and useful, and that they would be capable of using it in real-life situations both for themselves and for others.
Keywords: robot personalization,robot learning,Activity recognition,robot teaching,robot companion
Updated 2025-09-19 17:13:59
-
[IEEE 2019 PhotonIcs & Electromagnetics Research Symposium - Spring (PIERS-Spring) - Rome, Italy (2019.6.17-2019.6.20)] 2019 PhotonIcs & Electromagnetics Research Symposium - Spring (PIERS-Spring) - Degenerate Energy Exchange between Optical TE₂-modes of the Planar Waveguide Based on a Thin Left-handed Film and a Nonlinear Substrate
Abstract: Face recognition (FR) systems in real-world applications need to deal with a wide range of interferences, such as occlusions and disguises in face images. Compared with other forms of interference such as nonuniform illumination and pose changes, faces with occlusions have not yet attracted enough attention. A novel approach, coined dynamic image-to-class warping (DICW), is proposed in this work to deal with this challenge in FR. The face consists of the forehead, eyes, nose, mouth, and chin in a natural order, and this order does not change despite occlusions. Thus, a face image is partitioned into patches, which are then concatenated in raster scan order to form an ordered sequence. Considering this order information, DICW computes the image-to-class distance between a query face and those of an enrolled subject by finding the optimal alignment between the query sequence and all sequences of that subject along both the time dimension and the within-class dimension. Unlike most existing methods, our method is able to deal with occlusions that exist in both gallery and probe images. Extensive experiments on public face databases with various types of occlusions have confirmed the effectiveness of the proposed method.
Keywords: image-to-class distance,Face recognition,biometrics,dynamic time warping,occlusion
Updated 2025-09-19 17:13:59
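The keywords above list dynamic time warping, and the abstract describes aligning ordered patch sequences. The sketch below shows plain dynamic time warping between two sequences only; it is not DICW, which according to the abstract also aligns along a within-class dimension across all images of a subject. The function name and the Euclidean patch distance are assumptions.

```python
import numpy as np

def dtw_distance(query, reference):
    """Classic DTW cost between two sequences of feature vectors.

    query: (N, D) array, reference: (M, D) array. Local cost is the
    Euclidean distance between patch feature vectors (an assumption).
    """
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n, m = len(query), len(reference)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(query[i - 1] - reference[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = local + min(cost[i - 1, j],
                                     cost[i, j - 1],
                                     cost[i - 1, j - 1])
    return float(cost[n, m])

same = dtw_distance([[0.0], [1.0]], [[0.0], [1.0]])
stretched = dtw_distance([[0.0], [1.0]], [[0.0], [0.0], [1.0]])
shifted = dtw_distance([[0.0]], [[3.0]])
```

Because the alignment preserves order while allowing elements to be repeated, `stretched` is zero even though the two sequences differ in length; this order-preserving property is what the abstract relies on when it notes that the order of facial parts does not change under occlusion.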
-
Integrated Optical Fiber Force Myography Sensor as Pervasive Predictor of Hand Postures
Abstract: Force myography (FMG) is an appealing alternative to traditional electromyography in biomedical applications, mainly due to its simpler signal pattern and immunity to electrical interference. Most FMG sensors, however, send data to a computer for further processing, which reduces user mobility and, thus, the chances for practical application. To address this, this work proposes to remodel a typical optical fiber FMG sensor with smaller portable components. Moreover, all data acquisition and processing routines were migrated to a Raspberry Pi 3 Model B microprocessor, ensuring comfort of use and portability. The sensor was successfully demonstrated for 2 input channels and the classification of 9 postures, with an average precision and accuracy of ~99.5% and ~99.8%, respectively, using a feedforward artificial neural network with 2 hidden layers and a competitive output layer.
Keywords: user interface,optical fiber sensor,Force myography,gesture recognition,integrated sensor
Updated 2025-09-19 17:13:59