### 國立交通大學

### 電機與控制工程學系

### 碩士論文

使用於即時心率估計系統之記憶體式 FFT 設計

متلللين

Design of a Memory-based FFT for Real-Time Heart Rate Estimation System

研究生:藍啟仁

指導教授:蔡尚澕 博士

中華民國一百年五月

### 使用於即時心率估計系統之記憶體式 FFT 設計

### **Design of a Memory-based FFT for Real-Time Heart**

#### **Rate Estimation System**

研究生:藍啟仁Student : Chi-Jen Lan指導教授:蔡尚澕 博士Advisor : Dr. Shang-Ho Tsai

#### 國立交通大學

電機與控制工程學系

#### 碩士論文

A Thesis Submitted to department of Electrical and Control Engineering College of Electrical Engineering National Chiao Tung University in Partial Fulfillment of the Requirements for the Degree of Master of Science in Electrical and Control Engineering

May 2011 Hsinchu, Taiwan, Republic of China



### 使用於即時心率估計系統之記憶體式 FFT 設計

研究生:藍啟仁 指導教授:蔡尚澕 博士

國立交通大學電機與控制工程學系

#### 摘要

本文論提出一個用於穿戴式紡織電極的即時心率估計系統的記憶體式快速 傅利葉轉換,心電圖訊號容易受到電極與身體間接觸鬆弛和走動的影響。也就是 基線電位干擾。我們使用子空間追蹤技術消除基線電位的干擾,接加入絕對值運 算來符合後面心率估計系統所需。心率估計系統使用自我相關度的方法來找出 R-R 間距並計算出心率。為了能達到即時的需求,我們使用藉由 Power Method 而得秩為一的適應性子空間追蹤技術。在心率估計方面,我們用 FFT 跟 IFFT 來 實現自我相關度的計算。

在硬體實現方面,我們使用 FPGA 來做訊號處理與心率估測的部份。在提出 的 FFT 部份,由於心率估計系統是一個工作頻率不快的系統,因此我們採用記憶 體式的架構來減少面積和功率的消耗。設計的結果經過編譯並下載至 Altera EP2C35F672 Cyclone II FPGA 裝置上驗證。

關鍵字:記憶體式FFT、心電圖、穿戴紡織性電極、基線飄移、心率估計、 FPGA 實現

# Design of a Memory-based FFT for Real-Time Heart Rate Estimation System

Student : Chi-Jen Lan

Advisor : Dr. Shang-Ho Tsai

Institute of Electrical and Control Engineering National Chiao-Tung University

#### Abstract

This thesis presents a memory-based fast Fourier transform (FFT) for a real time heart rate estimation system for wearable textile sensors. The electrocardiogram (ECG) is seriously affected by loose contact between biopotential electrodes and skin, and body movement. Also, it has baseline wander interference. For the baseline wander interference, we use the subspace approach to remove the baseline wander component, then uses the absolute operation for heart rate estimate. The method of heart rate estimation applies the correlation technique to find the R-R interval and estimates the heart rate. For the requirement of real time, we use rank one adaptive algorithm to find dominant eigenvalue by the power method. For the heart rate estimator, we use FFT and IFFT to achieve the correlation technique.

In the hardware implementation, we use FPGA implementation to achieve the signal processing and heart rate estimation. In the proposed FFT, we adopt memory-based architecture to reduce area and power dissipation since heart rate estimation system is a low frequency system. The resulting design are compiled and download to the Altera EP2C35F672 Cyclone II FPGA device for real-time verification.

Keywords: Memory-based FFT, Electrocardiogram, wearable textile electrodes, baseline wander, heart rate estimation, FPGA implementation

#### 誌謝

在短短两年的碩士班研究生涯中,我要特別感謝我的指導教授蔡尚澕老師 和鄭木火老師在學術研究與待人處事上的教導,由於老師的不厭其煩的教誨, 指引了我進行研究時應有的態度與方法,故在本論文付梓之際,對於辛勤傳道並 耐心授業的老師致上最誠摯的謝意。在口試期間,承蒙口試委員林源倍教授、簡 鳳村教授撥冗指導並提供許多寶貴的意見,使此論文得以更臻完善,在此也誠摯 地感謝您們的辛勞。

感謝實驗室陳立中學長、洪英哲學長、梁嘉明學長、溫宏揚學長、楊凱鈞學 長在我研究過程中對我不厭其煩的指導與建議,感謝實驗室的夥伴建男、明華、 銘釗陪我一同作研究,沒有你們實驗室將會缺少活力與歡樂,一起買便當、一起 熬夜寫報告的生活,讓我體會了作為一個研究生的點滴滋味,由於你們的耐心陪 伴、鼓勵打氣,讓我能在辛苦的研究生活中,依然可以心情愉悅。在研究期間受 到很多朋友與師長的幫助和鼓勵,才能讓我在低潮時期不放棄,每當挫折時,就 會想起你們的話,你們的加油打氣,每次都讓我覺得很窩心。

最後,我要感謝我的家人,溫暖和諧的家庭讓我可以無後顧之憂的汲取知識, 專心致力於研究之上,因為你們長久以來的支持鼓勵和包容體諒,使我能順利完 成學業,並做好面對下一個人生階段挑戰的準備。另外也感謝女友雅俐一路上的 支持與陪伴。最後再次謝謝,週遭關心我的師長、家人及朋友,願將這份喜悅與 你們分享。



# Contents

| List of Tables ii |                    |                             |    |  |  |  |  |  |  |  |
|-------------------|--------------------|-----------------------------|----|--|--|--|--|--|--|--|
| Lis               | List of Figures iv |                             |    |  |  |  |  |  |  |  |
| 1                 | Intro              | oduction                    | 1  |  |  |  |  |  |  |  |
|                   | 1.1                | Introduction                | 1  |  |  |  |  |  |  |  |
|                   | 1.2                | Literature Survey           | 2  |  |  |  |  |  |  |  |
|                   | 1.3                | Motivation                  | 4  |  |  |  |  |  |  |  |
|                   | 1.4                | Thesis Overview             | 4  |  |  |  |  |  |  |  |
| 2                 | Elec               | trocardiogram Background    | 5  |  |  |  |  |  |  |  |
|                   | 2.1                | The Electrocardiogram (ECG) | 5  |  |  |  |  |  |  |  |
|                   | 2.2                | Biopotential Electrodes     | 8  |  |  |  |  |  |  |  |
| 3                 | Base               | line Wander Removal         | 11 |  |  |  |  |  |  |  |
|                   | 3.1                | Introduction                | 11 |  |  |  |  |  |  |  |
|                   | 3.2                | The Power Method            | 11 |  |  |  |  |  |  |  |
|                   | 3.3                | Baseline Wander Removal     | 13 |  |  |  |  |  |  |  |
|                   | 3.4                | Experiments                 | 14 |  |  |  |  |  |  |  |
| 4                 | Hear               | rt Rate Estimation          | 16 |  |  |  |  |  |  |  |
|                   | 4.1                | Introduction                | 16 |  |  |  |  |  |  |  |
|                   | 4.2                | Correlation Method          | 17 |  |  |  |  |  |  |  |
|                   | 4.3                | Experiments                 | 19 |  |  |  |  |  |  |  |

| 5 | Har | dware Implementation 2.                |
|---|-----|----------------------------------------|
|   | 5.1 | Analog Circuit Design                  |
|   |     | 5.1.1 Instrumentation Amplifier        |
|   |     | 5.1.2 Notch Filter                     |
|   |     | 5.1.3 Lowpass Filter                   |
|   |     | 5.1.4 Highpass Filter                  |
|   |     | 5.1.5 Output Stage Amplifier           |
|   | 5.2 | Analog-to-Digital Converter            |
|   | 5.3 | FPGA System                            |
|   | 5.4 | Baseline Wander Removal                |
|   | 5.5 | Fast Fourier Transform Algorithm    34 |
|   | 5.6 | FFT Architecture for Implementation    |
|   |     | 5.6.1 Memory-based Architecture 39     |
|   |     | 5.6.2 Pipeline Architecture 4          |
|   | 5.7 | Heart Rate Estimator                   |
|   | 5.8 | Experimental and Result                |
| 6 | Con | clusion and Future Work 52             |

#### Bibliography

53

# **List of Tables**

| 5.1 | Pin functions of input . | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | • | 30 |
|-----|--------------------------|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|----|
| 5.2 | Pin functions of output  |   | • |   |   |   |   |   |   |   |   |   |   | • |   |   |   |   | • |   | • |   |   |   |   | • |   |   | 30 |



# **List of Figures**

| 1.1 | The Filter-bank [14]                                                     | 3  |
|-----|--------------------------------------------------------------------------|----|
| 2.1 | Representative electrical activity from various regions of the heart     | 6  |
| 2.2 | The components of the ECG waveform                                       | 7  |
| 2.3 | The foam-pad electrode                                                   | 9  |
| 2.4 | The steel textile electrode                                              | 9  |
| 3.1 | (a) Input signal $i(n)$ (b) The relationship between $z(n)$ and $z(n-1)$ | 13 |
| 3.2 | Top: the original ECG signals. Middle: the baseline wandering compo-     |    |
|     | nent. Buttom: the result without baseline                                | 15 |
| 4.1 | Self-correlated                                                          | 17 |
| 4.2 | Correlation values with baseline wander                                  | 20 |
| 4.3 | ECG data in frequency domain                                             | 21 |
| 4.4 | Correlation values by FFT and IFFT                                       | 21 |
| 4.5 | The heart rate estimation using correlation approach                     | 22 |
| 5.1 | The integral structure of the real time heart rate estimation system     | 23 |
| 5.2 | Basic analog circuits for signal preprocessing                           | 24 |
| 5.3 | Notch filter                                                             | 25 |
| 5.4 | Frequency response of the designed notch filter                          | 26 |
| 5.5 | Second order Sallen-Key lowpass filter                                   | 26 |
| 5.6 | Frequency response of the designed lowpass filter                        | 27 |
| 5.7 | Second order Sallen-Key highpass filter                                  | 27 |
| 5.8 | Frequency response of the designed highpass filter                       | 28 |

| 5.9  | Output stage amplifier                                                | 28 |
|------|-----------------------------------------------------------------------|----|
| 5.10 | Analog-to digital conter, LTC1282                                     | 29 |
| 5.11 | Timing diagram of LTC1282                                             | 31 |
| 5.12 | FPGA board                                                            | 32 |
| 5.13 | Block diagram of baseline wander removal                              | 33 |
| 5.14 | Finite state machine of baseline wander removal                       | 34 |
| 5.15 | (a) Diagram of 8-point decimation-in-time FFT (b) Decimation-in-time  |    |
|      | butterfly                                                             | 36 |
| 5.16 | (a) Diagram of 8-point decimation-in-frequency FFT (b) Decimation-in- |    |
|      | frequency butterfly                                                   | 37 |
| 5.17 | (a) Radix-4 decimation-in-frequency butterfly (b) Diagram of 16-point |    |
|      | radix-4 DIF FFT                                                       | 40 |
| 5.18 | Block diagram of memory-based architecture                            | 41 |
| 5.19 | Radix-2 DIF SDF                                                       | 42 |
| 5.20 | SDF operation modes (a) mode1 (b) mode2                               | 42 |
| 5.21 | Radix-2 DIF MDC                                                       | 42 |
| 5.22 | The worst case of situation                                           | 43 |
| 5.23 | Proposed FFT block diagram                                            | 44 |
| 5.24 | State flow chart of the FFT module                                    | 44 |
| 5.25 | Group reverse between input and output                                | 45 |
| 5.26 | The symmetry property of twiddle factor                               | 48 |
| 5.27 | RAM address assignment of FFT                                         | 49 |
| 5.28 | ROM address assignment of FFT                                         | 49 |
| 5.29 | Brief system specifications                                           | 50 |
| 5.30 | Compile report of the entire system                                   | 51 |
| 5.31 | Results display on the FPGA device                                    | 51 |

# Chapter 1

# Introduction

#### **1.1 Introduction**

One of the important physiological signals is electrocardiogram (ECG) that is used to diagnosing various cardiac diseases such as hypercalcemia and arrhythmia. These diseases is not obvious in initial symptoms. It is used to record the ECG signal for 24 hours to observe the waveform. However, the ECG medical devices needs more comfortable electrodes for long-term monitoring. Thus, we focus on wearable textile sensors used in heart rate estimation system [1]. The wearable textile sensors are comfortable, washable and durable. However unlike the metal-plate electrodes, textile electrode are dry contact that affect the ECG signals in measured by respiration, electrodes and body movement. The heart rate estimation includes the signal processing algorithm and heart rate estimator since the heart rate and the heart rate variability is an important physiological information [2]. For the baseline wander induced by texitile electrode, we use the adaptive subspace algorithm [11] to remove the baseline wander component. Taking the absolute value for heart rate estimate. In the heart rate estimator, we use fast Fourier transform (FFT) for computing the correlation value and estimate the heart rate. The results are emulated by Altera EP2C35F672 Cyclone II FPGA.

### **1.2 Literature Survey**

The wearable sensor made of textiles and for the heart rate have been popular recently [3]– [5]. [3] is tele-home healthcare that uses the wearable medical clothes and bluetooth receiving terminal for showing and storing the ECG data. Activity patient-monitor [4] for monitoring belt. The multisensor harness system with acoustic and optical heart rate sensors, and temperature sensors[5].

In order to reduce the noise from motion artifacts (stading, walking, and running), many wearable sensors are sewn on clothes or harness [5] to avoid the friction or the loose contact between electrodes and skin. The ECG data evaluate and disply on terminal like PC, and use the software like matlab or LabVIEW to estimate the heart rate. Thus, the distance between system and terminal cannot be too far.

In QRS detection algorithms, almost algorithms use a filter stage to enhance the QRS complex and attenuate other component since the frequency range of QRS complex is about 10 to 25 Hz. There are many methods for detecting the QRS complex [12], such as Derivatives-based, Wavelet-based algorithm, Filter-bank, and Mathematical Morphology are classic QRS detection algorithms.

The Derivative-based detects the characteristic steep slope of QRS complex [6]. It is similar to the high-pass filter. The basic Derivative-based algorithm equation is

$$y_1(n) = x(n+1) - x(n-1)$$
(1.1)

The algorithm is simple but it degrades antinoise performance.

The Filter-bank method [14] contains a set of filter to separate the input signal into various subbands as shown in Fig 1.1. Filter-bank does not affect the time delay and phase distortions, but it is hard to design a low order, high performance filter.

The Wavelet transform (WT) [7] is used for singularity detection. The WT method is similar to the Filter-bank that can decompose the ECG signal and enhance the QRS complex. The coefficient of low-pass and high-pass are approximations and details. We can design the wavelet filter bank by proper approximations and details coefficients.

The basic operations of Mathematical Morphology [8] are the terms erosion and dila-



tion, as below

Dilation: 
$$f \oplus g(n) = \max(f(n+i) + g(i))$$
 (1.2)

Erosion: 
$$f\Theta g(n) = \min(f(n+i) - g(i))$$
 (1.3)

Opening: 
$$f \circ g(n) = f\Theta g(\oplus g)(n)$$
 (1.4)

Closing: 
$$f * g(n) = f \oplus g(\Theta g)(n)$$
 (1.5)

where f(n) is a signal that  $1 \le n \le N$  and the structure element is g(n) that  $1 \le n \le M$ and M < N. An erosion followed by a dilation, called Opening. A dilation followed by an erosion, called Closing. According the opening and closing operation, we can detect the peak and valley. Thus, it can be used for detecting the QRS complex. The disadvantage of Mathematical Morphology is mass time requirement.

#### **1.3** Motivation

Due to the compaction of life, low exercise, stay up late, and unhealthy habits of eating, many people are accompanied by chronic illness. The medical device for detecting the symptoms of disease have been popular. For the convenient and high quality life, a comfortable and wearable system is important [3]. The wearable system of ECG monitoring should be not only comfortable, but also should be have ability of long-term and real-time monitoring. In order to provide a real-time, small size, and low power ECG monitoring device, we try to find some ways that can reduce hardware complexity. Therefore we adopt the memory-based architecture FFT for estimating heart rate.

### 1.4 Thesis Overview

Chapter 2 reviews the principle of electrocardiogram (ECG) and introduces the properties of biopotential electrodes. Chapter 3 uses rank one adaptive subspace technique for removing the baseline wander component and simulation result by matlab. In Chapter 4, we introduce the heart rate estimator and simulate the heart rate estimation by correlation technique. In Chapter 5, the FPGA implementation of whole heart rate estimation system is demonstrated. The last chapter is the conclusion and future work.

## **Chapter 2**

## **Electrocardiogram Background**

There are many signals in the body, the bioelectrical phenomena includes electrocardiogram (ECG), electromyogram (EMG), and electroencephalogram (EEG), electroneurogram (ENG), and electroretinogram (ERG). They can be converted from energy to the electrical signals by sensor and there are valuable medical information for symptom of disease and for diagnosis such as arrhythmia, ventricular hypertrophy, and atriomegaly. In this thesis, we develop the estimator of heart rate. We introduce the basic of the electrocaridiogram, then discussing why we choose the textile electrodes.

### 2.1 The Electrocardiogram (ECG)

The electrocardiogram (ECG) signal is produced by heart and the heart is one of the important organ in the body. ECG records the property of electrical conductivity of the heart and can be used to detect the heart diseases by analysis the graph. Investigating the R-R interval can verify the autonomic neuroregulation system. The cycle of ECG has three phase, the polarization phase, the depolarization phase, and the repolarization phase. The polarization phase, also called the ready phase that has a resting potential of approximately -85 mV [9]. During the depolarization phase, the muscle cell has steady resting potential across the cell membrane, lessening this charge towards zero and leads to cardiac constriction. The repolarization phase is also called recovery phase and diastole that is between depolarization and polarization. Before the depolarization phase, it



Fig. 2.1: Representative electrical activity from various regions of the heart

must returns to polarization, the period between depolarization and polarization is called repolarization. The action phase in Fig. 2.1 represents the temporal action potential from each cellular of the heart. The heart has several conductive tissues, such as atrial and ventricular muscles, sinoatrial(SA) and atrioventricular(AV) nodes, and Purkinje fibers. Each excitable tissue has its own electrical activity potential. These action potential can be measure by biopotential electrodes on body interface.

A normal ECG in a period of time, as shown in Fig. 2.2 consist of P, T waves, QRS complex, PR interval, ST segment and QT interval, described as follows:

• P wave

The P wave is the first upward wave of the ECG and represents left and right atrial contraction (depolarization). The electrical impulse originate in the sinoatrial (SA) node and through the atrial musculature. The P wave should be upright and the duration of P wave should not be more than 0.12 seconds. The atrial problem may cause abnormal P wave.

• QRS complex

The QRS complex is the most important wave in ECG signal. It represents the left and right ventricular muscles depolarization. The normal duration of QRS complex interval is 0.05 to 0.10 seconds. If the duration is more than 0.12 seconds, it indicates abnormal intraventricular conduction and ventricular arrhythmia.



Fig. 2.2: The components of the ECG waveform

• T wave

The T wave represents the recovery (repolarization) of the ventricles, usually concerned the direction, the shape, and the height of the wave. The normal direction of The T wave is upright and the shape is slightly rounded and the normal height is not above 5 mm. Abnormal T wave may cause dectrolyte disturbance, like hypercalcemia and hypocalcemia.

• U wave

The U wave is a small additional wave which follows the T wave, it does not always appear in the ECG. The diabetes, digoxin and slow repolarization of the ventricular may cause the abnormal tall U wave.

• PR interval

The PR interval is measured from the start of the P wave to the start of the QRS complex. The duration of the normal PR interval is 0.12 to 0.20 seconds.

• PR segment and ST segment

The PR segment and ST segment are related to isoelectric line. The PR segment represents the time taken to conduct through the slow AV junction and the ST segment signifies the period of early deporization of the ventricles.

• QT interval

July 1

The QT interval is measured from the start of QRS complex to the end of the T wave. The interval represents a complete ventricular cycle of depolarization and repolarization. It should be less than half of the R-R interval. A long QT interval wider than 1/2 of the R-R interval indicates there is a risk for developing hemodynamically unstable dysrhythmias and higher incidence of sudden death.

### 2.2 Biopotential Electrodes

It is necessary to provide some interface between the electronic measuring circuit and the body for measuring and recording potentials in the body, called biopotential electrodes. The biopotential electrodes should have the ability of transmitting the current across the interface between the body and the measuring circuit. The electrode must serve as a transducer to change an ionic current into an electronic current since current is carried in the body by ions. There are two types of electrodes, internal (invasive) and body-surface (noninvasive) electrodes.

• metal-plate electrodes



Fig. 2.3: The foam-pad electrode

In recent years, the noninvasive examination has been popular because it's not need to hurt or inbreak the body. There are various types of the electrodes such as metalplate, suction, floating and flexible electrodes, one of the most frequently used form of measuring the bioelectric signal is the metal-plate electrodes. Fig. 2.3 is one of the metal-plate electrodes. This style of the electrode is disk-shaped. When the electrodes apply to the patient, the area of skin should be clean. we generally use electrolyte gel which contains Cl<sup>-</sup> to the patient to maintain good contact. However, patients' skin may be over-sensitive with the Ag/AgCl electrodes or uncomfortable for long-term monitorting.

• steel textile electrode



Fig. 2.4: The steel textile electrode

In order to increase the comfort of patients' feeling, the development of fabric-based electrodes is becoming more important, in this thesis, we measure ECG signals using textile electrode as shown in Fig. 2.4 to estimate the heart rate in real time.

the sensor not only soft and comfortable to the wearing person, but also washable and reusable for economical consideration.



# Chapter 3

# **Baseline Wander Removal**

### 3.1 Introduction

The ECG signals we measured from the body by wearable electrodes of steel textiles have many interference. The interference is primarily produced by the loose contact with skins of electrodes, respiration, and body movement. These interference are called baseline wander that can affect the accurate diagnosis of cardiac diseases. However, it is important to provide a stable signal and clearly ECG signal characteristic for heart rate estimation and reliable diagnosis. There are many methods to find out how to remove baseline drift and they have been proposed in twenty years. Usually we use wavelet transform to decompose the signals and eliminate the baseline wander component. In this thesis, we adopt adaptive subspace baseline wander removal to achieve lower computation; this method can be realized in a real-time manner. First, we introduce the power method. Then we use the power method for removing baseline wandering

#### **3.2 The Power Method**

The power method [10] is a way for finding eigenvalue that has the largest in absolute value in matrix **A**. At first, we suppose matrix **A** is an  $n \times n$  diagonalizable that has a dominant eigenvalue. Then we choose an initial nonzero vector  $x_0$  in  $\mathbb{R}^n$ . The iteration is described as follows:

for :k = 1, 2, ...  

$$\mathbf{z}_k = \mathbf{A}\mathbf{x}_{k-1}$$
  
 $\mathbf{x}_k = \mathbf{z}_k / \|\mathbf{z}_k\|$   
 $\lambda_k = [\mathbf{x}_k]^H \mathbf{A}\mathbf{x}_k$   
end

Assume the eigenvalues of **A** are  $\lambda_1, \lambda_2, \dots, \lambda_n$ , and  $|\lambda_1| > |\lambda_2| \ge \dots \ge |\lambda_n|$ .  $\lambda_1$  is called the dominant eigenvalue of A. Let us explain why the power iteration converges and the initial vector  $\mathbf{x}_0$  can be approximated to a dominant eigenvector. Let

$$\mathbf{x}_0 = c_1 \mathbf{y}_1 + c_2 \mathbf{y}_2 + \dots + c_n \mathbf{y}_n \tag{3.1}$$

where  $c_1, c_2, \dots, c_n$  are scalars, and  $y_1, y_2, \dots, y_n$  are *n* linear independent eigenvectors in A. Multiplying both sides of (3.1) by  $\mathbf{A}^k$ , we have

$$\mathbf{A}^{k}\mathbf{x}_{0} = \mathbf{A}^{k}(c_{1}\mathbf{y}_{1} + c_{2}\mathbf{y}_{2} + \cdots + c_{n}\mathbf{y}_{n})$$

$$= c_{1}\mathbf{A}^{k}\mathbf{y}_{1} + c_{2}\mathbf{A}^{k}\mathbf{y}_{2} + \cdots + c_{n}\mathbf{A}^{k}\mathbf{y}_{n}$$

$$= c_{1}\lambda_{1}^{k}\mathbf{y}_{1} + c_{2}\lambda_{2}^{k}\mathbf{y}_{2} + \cdots + c_{n}\lambda_{n}^{k}\mathbf{y}_{n}$$

$$= \lambda_{1}^{k}[c_{1}\mathbf{y}_{1} + c_{2}(\frac{\lambda_{2}}{\lambda_{1}})^{k}\mathbf{y}_{2} + \cdots + c_{n}(\frac{\lambda_{n}}{\lambda_{1}})^{k}\mathbf{y}_{n}]$$
(3.2)

We assume that  $|\lambda_1|$  is lager than other absoluted eigenvalues in, it follows

$$\left|\frac{\lambda_2}{\lambda_1}\right| < \left|\frac{\lambda_3}{\lambda_1}\right| < \dots < \left|\frac{\lambda_n}{\lambda_1}\right| < 1$$

Therefore

$$(\frac{\lambda_i}{\lambda_1})^k \approx 0, \quad 2 \le i \le n$$

where k is sufficiently large. Then the power method in (3.2) converges and the approximation is given by

$$\mathbf{A}^k \mathbf{x}_0 \approx c_1 \lambda_1^k \boldsymbol{y}_1 \tag{3.3}$$

The approximation in (3.3) improves when k increases. Any scalar multiple of  $y_1$  is also a dominant eigenvactor because  $y_1$  is a dominant eigenvector. Thus  $\mathbf{A}^k \mathbf{x}_0$  converges and approaches a multiple of the dominant eigenvector of matrix  $\mathbf{A}$ .

#### 3.3 Baseline Wander Removal

We decompose the ECG signal into various subspaces. that is, the ECG signal can be a linear combination of various subspaces. The baseline wander component is the principle subspace that spanned by dominant eigenvalue since the baseline wander has the largest energy in ECG signal. We use the power method for tracking dominant eigenvector. The adaptive subspace technique that uses subspace of rank one is sufficient to find the baseline wander component. It is simple but efficient and can be realized easily.

Assume the data correlation matrix  $\Gamma(n)$  is estimated by input vector i(n). Actually,  $\Gamma(n)$  is a slowy varing matrix of time, and it updates according to the following formula

$$\Gamma(n) = \alpha \Gamma(n-1) + (1-\alpha)\boldsymbol{i}(n)\boldsymbol{i}^{T}(n)$$
(3.4)

where  $\alpha$  is  $0 < \alpha < 1$  and is very close to 1; the input vector  $i(n) = [x(n - N + 1), \dots, x(n-1), x(n)]^T$  is updated by FIFO (first-in, first-out) shown in Fig. 3.1(a). Then given an initial nonzero vector z(0), and the power iteration is given by

$$\boldsymbol{s}(n) = \boldsymbol{\Gamma}(n)\boldsymbol{z}(n-1) \tag{3.5}$$

$$\boldsymbol{z}(n) = \frac{\boldsymbol{s}(n)}{\|\boldsymbol{s}(n)\|}$$
(3.6)



Fig. 3.1: (a) Input signal i(n) (b) The relationship between z(n) and z(n-1)

Now we decompose z(n) in another way that one component is spanned by previous subspace z(n-1) and the other component  $\Delta(n)$ , which is a orthogonal subspace shown

in Fig. 3.1(b). Then

$$\boldsymbol{z}(n) = \boldsymbol{z}^{T}(n)\boldsymbol{z}(n-1)\boldsymbol{z}(n-1) + \boldsymbol{\Delta}(n)$$
(3.7)

From (3.5), we know

$$\boldsymbol{s}(n-1) = \boldsymbol{\Gamma}(n-1)\boldsymbol{z}(n-2) \tag{3.8}$$

Substituting (3.4) and (3.7) into (3.5), we obtain

$$\boldsymbol{s}(n) = [\alpha \boldsymbol{\Gamma}(n-1) + (1-\alpha)\boldsymbol{i}(n)\boldsymbol{i}^{T}(n)]\boldsymbol{z}(n-1)$$
  
$$= \alpha \boldsymbol{\Gamma}(n-1)\boldsymbol{z}(n-1) + (1-\alpha)\boldsymbol{i}(n)\boldsymbol{i}^{T}(n)\boldsymbol{z}(n-1)$$
  
$$= \alpha \boldsymbol{s}(n-1)\boldsymbol{z}^{T}(n-1)\boldsymbol{z}(n-2)$$
  
$$+ \boldsymbol{\Gamma}(n-1)\boldsymbol{\Delta}(n-1)$$
  
$$+ (1-\alpha)\boldsymbol{i}(n)\boldsymbol{i}^{T}(n)\boldsymbol{z}(n-1)$$
(3.9)

We assume  $\Delta(n-1)$  approaches 0 and hence  $\mathbf{z}(n-1)^T \mathbf{z}(n-2) \approx 1$  because the angle between  $\mathbf{z}(n-1)$  and  $\mathbf{z}(n-2)$  approaches 0. Then we have

$$\boldsymbol{s}(n) = \alpha \boldsymbol{s}(n) + (1 - \alpha) \boldsymbol{i}(n) \boldsymbol{i}^{T}(n) \boldsymbol{z}(n-1)$$
(3.10)

Finally, we obtain the dominant eigenvector and obtain the baseline wander component as

$$\boldsymbol{b}(n) = \boldsymbol{i}^T(n)\boldsymbol{z}(n)\boldsymbol{z}(n)$$
(3.11)

Then we subtract the baseline from the input signal and the output, i.e

$$\boldsymbol{y}(n) = \boldsymbol{i}(n) - \boldsymbol{b}(n) \tag{3.12}$$

#### 3.4 Experiments

In this section, we use the ECG data from Ming Young Biomedical Corporation with 8bit resolution and sampling rate of 200 samples per second. By experience, the length of vector is set to be 40 for adaptive subspace approach. The coefficent  $\alpha$  is set to be 0.99. The larger  $\alpha$  leads good converge but slower response, and smaller  $\alpha$  leads faster response but the slower convergence. Fig. 3.2 depicts the performance of baseline wander removal



Fig. 3.2: Top: the original ECG signals. Middle: the baseline wandering component. Buttom: the result without baseline

that uses adaptive subspace algorithm. The top figure shows the original ECG data about 10 seconds measured by wearable sensors, and the original signal is affected by baseline wander. The middle figure shows the baseline is estimated by adaptive subspace techique. The bottom figure is the result without influence from baseline wander.

# **Chapter 4**

# **Heart Rate Estimation**

#### 4.1 Introduction

A ECG signals with sequence waves PORST repeats for every single heartbeat. The R wave of QRS complex is the primary feature of the cycle in ECG. Therefore, the heart rate is estimated the reciprocal of the R-R interval. Since the R-R interval is not always constant, the heart rate is the average of heartbeats per unit time in a specific time window. Many methods of heart rate estimation have been proposed in thirty years [12]. In general, the method to detect the energy or frequency of QRS complex and search peaks to estimate heart rate are, for instance, such as wavelet, filter bank, and mathematical morphology.

The estimation of heart rate base on wavelet is to decompose the signal; small scales represent the high frequency component, and large scales represent the low frequency component of the signal. Thus, wavelet transform can work as a band-pass filter by given a proper scale [13].

The filter-bank approach is similar to the wavelet transform method. The frequency of QRS complex is about 3-40 Hz, the P and T waves is about 0.5-10 Hz. Then design the subbands at different scales to find the local maxima and the position [14].

For mathematical morphology method, that uses nonlinear signal processing operators to enhance the particular shape of the signals by setting operations like Dilation, Erosion, Opening and Closing [15]. Since the shape of QRS complex has sharp positive and negative peaks, using a peak-valley (PV) extractor can enhance the QRS complex and suppress the P, T waves and noise. The opening followed by closing of the signal provides one way for peak-valley extraction.

### 4.2 Correlation Method

In this section, we use the correlation approach to find the position of local maxima and evaluate the heart rate directly. Evaluating the correlation value can clearly distinguish the distance between the first R wave and the second R wave since the characteristic of ECG signals is periodic with peaks. As the shifted length  $l = 2\tau$ , the correlation value is smaller than shifted length  $l = \tau$ . Therefore, we can search simply to find the first R-R interval because the correlation value is largest. Fig. 4.1 depicts the self-correlated



Fig. 4.1: Self-correlated

concept for shifted length l, the peaks reflect the R waves and the distance of R-R interval is  $\tau$ . If l = 0, it has maximum correlation value. When  $l = \tau$ , the magnitude of correlation

value is secondary, otherwise, the correlation value should be very small. The correlation method formula is defined as below

$$c(l) = \sum_{n} y(n)y(n+l) \tag{4.1}$$

The correlation operates similar to linear convolution. If y(n) is a perfectly periodic signal, the correlation value should have peak when the signal shifts the multiple of the period. However it needs mass calculation for computing correlation value directly. Fortunately, we can use the property of the convolution and apply the Fourier Transform to reduce the computation [20]. It can be shown as follows

$$\begin{split} C(e^{j\omega}) &= \sum_{l} c(l)e^{-j\omega l} \\ &= \sum_{l} \sum_{n} y(n)y(n+l)e^{-j\omega l} \\ &= \sum_{l} \sum_{m} y(m)y(l-m)e^{-j\omega l} \\ &= \sum_{l} (\sum_{m} y(-m)e^{-j(-\omega)(-m)})y(l-m)e^{-j\omega(l-m)} \\ &= Y(e^{j\omega})Y(e^{-j\omega}) \\ &= Y(e^{j\omega})Y^*(e^{j\omega}) \quad \text{if } y(n) \text{ is real} \end{split}$$

The input signal from baseline wander removal output should be performed the absolute output values before we compute the correlation value. After removes the baseline wandering component, the signal has both positive and negative values. Thus, if we use the correlation method, it will minus the peak of the correlation value and may lead to that the correlation value is not the primary peak anymore, and misguide to evaluate the real heart rate.

Before the Fast Fourier Transform (FFT) operation, zero padding is needed in time domain since the original signal is finite length. One of the property of Discrete Fourier Transform (DFT) is that the multiplication of two sequences in the frequency domain reflects the circular convolution in the time domain. Zero padding makes the computation of the circular convolution equal to linear convolution [16]. The procedure of the correlation approach is as follows

- $\hat{y}(n)$ : Zero padding to the input signals y(n)
- $C(e^{j\omega})$ : Perform the FFT of  $\hat{y}(n)$
- Calculate  $||C(e^{j\omega})||^2$
- c(l): Perform the Inverse FFT of  $||C(e^{j\omega})||^2$

Therefore, we can simplify the correlation approach by using the FFT and the Inverse FFT (IFFT). The peak of correlation value occurs when the ECG signals shift length *l*. We obtain the R-R interval between the first R wave and the second R wave without computation. The heart rate per minute is evaluated by the following formula

Heart Rate = 
$$60 \times \frac{\text{Sampling frequency}}{\text{R-R interval}}$$
 (4.2)

However, the R-R intervals are not always constant and may be influenced by body movement. The peak of the correlation value vanishes when the periodic of ECG signals is faraway. Therefore, we propose a reliable ECG indicator Q(n) that shows the difference of the first R-R interval and the second R-R interval. The position of second R-R interval can be found by searching directly. The Q(n) is given by

$$Q(n) = \frac{\text{position of 2nd peak}(n) - \text{position of 1st peak}(n)}{\text{position of 1st peak}(n)}$$
(4.3)

### 4.3 Experiments

We estimate the heart rate every 512 samples, that is, updates the heart rate about one second since the sample rate is 512 Hz. Before we use correlation method to find the R-R interval, removing the baseline wander component is needed. The ECG signal with baseline wander component shows in Fig. 4.2(a). Fig. 4.2(b) shows the correlation value of Fig. 4.2(a). Since the correlation value is the product and summation of each point, the baseline wander component primarily decides the correlation value. Fig. 4.3(b) shows the frequency domain of Fig. 4.3(a). When the ECG signal transfor to the frequency



domain, we can not find out the information about the R-R interval in frequency domain. Therefore, we have to perform the IFFT to the time domain.

Fig. 4.4 depicts the result of the correlation approach. The top figure in Fig. 4.4 is the input of the FFT without baseline wandering component. The bottom figure shows the result of the correlation method. The ECG data is kept the same as that in section 3.4. The peak of the correlation values occurs at shift length 181 samples by direct searching, then the heart rate per minute is estimated by previous formula  $60 \times \frac{200}{181} = 66.3$  per minute. Fig. 4.5 shows the original ECG data, the ECG signals without baseline wandering component, the heart rate estimate, and the reliable ECG indicator. The ECG data of a wearing person sits at first, then he walks and jogs. The heart rate estimation is correct when the person jogging, the reliable ECG indicator degrades by loose the contact of the electrodes.



Fig. 4.4: Correlation values by FFT and IFFT



Fig. 4.5: The heart rate estimation using correlation approach

# **Chapter 5**

# **Hardware Implementation**



Fig. 5.1: The integral structure of the real time heart rate estimation system

Fig. 5.1 depicts the entire hardware architecture. After receiving the ECG signals from wearable textile, we filter the power-line interference and the frequency does not belong to ECG. We need analog front-end circuits (AFE). The circuits also amplify ECG signals for the A/D converter. After A/D converter, the signals is transmitted to an FPGA for signal processing and heart rate estimation. Finally The heart rate is displayed on the seven segment.

### 5.1 Analog Circuit Design

In order to observe and measure the weak signals (about 1mV) from body, amplify the signal to proper voltage level is needed. Therefore, one of the purpose of analog front-end circuits is to amplify the weak ECG signal. Then removes the power-line interference 60 Hz and obtain the ECG signal frequency band. Fig. 5.2 shows the entire analog circuits.



Fig. 5.2: Basic analog circuits for signal preprocessing

The circuits are worked on 3.3V single supply.

#### 5.1.1 Instrumentation Amplifier

The instrumentation amplifier should provide the high common-mode rejection ratio (CMRR) and the high input impedance. In general, the instrumentation circuit needs several oper-

ational amplifiers. However, we adopt AD623 to achieve that has low noise, high CMRR, high input impedance, low input bias current and can operate in 3.3V single supply.

#### 5.1.2 Notch Filter

Since ECG signals are easily affected by 60 Hz power-line interference or couple with noise from surroundings. The notch filter aim to filter 60Hz. We adopt a twin T circuit as a notch filter is shown in Fig. 5.3. We use MultiSim to confirm the frequency response. The result is performed in Fig. 5.4.



Fig. 5.3: Notch filter

#### 5.1.3 Lowpass Filter

In order to achieve the bandpass filter, we can combine the lowpass filter and the high pass filter as a bandpass filter. In lowpass filter circuits, we adopt the Sallen-Key active filter to realize [17]. Fig. 5.5 shows a generic second order Sallen-Key lowpass filter. The cutoff frequency can obtain by following equation

$$f_c = \frac{1}{2\pi\sqrt{R_1 R_2 C_1 C_2}}$$
(5.1)

The simulation result is shown in Fig. 5.6

Fig. 5.6 is the frequency response of lowpass filter that we designed.



Fig. 5.4: Frequency response of the designed notch filter



Fig. 5.5: Second order Sallen-Key lowpass filter

#### 5.1.4 Highpass Filter

The highpass filter also use a second order Sallen-Key highpass filter. Fig. 5.7 is the highpass filter that we designed. The cutoff frequency  $f_c$  is also computed by (5.1) and the cutoff frequency set to be 0.2 Hz. The frequency response is shown in Fig. 5.8.



Fig. 5.6: Frequency response of the designed lowpass filter



Fig. 5.7: Second order Sallen-Key highpass filter

#### 5.1.5 Output Stage Amplifier

After we filter the frequency of interference, it is necessary to amplify the signals for A/D converter. We design a noninverting amplifier with a large gain by following equation

$$V_{out} = \left(1 + \frac{R_b}{R_a}\right)V_{in} + V_{bias}$$
(5.2)

The  $R_a$  and  $R_b$  is chosen 1 k $\Omega$  and 600 k $\Omega$  respectively, and the gain achieve to 600 approximately. The amplifier circuits is shown in Fig. 5.9. The bias voltage 1.2V that



Fig. 5.8: Frequency response of the designed highpass filter



Fig. 5.9: Output stage amplifier

A/D converter provided without using operation amplifier.

### 5.2 Analog-to-Digital Converter

Before signal processing and estimates the heart rate on the FPGA, it is necessary to convert the analog ECG signal without interference to digital. We adopt the analog-to-digital converter (ADC) LTC1282 that Linear Technology produced. The main features of LTC1282 are

- Single supply 3V or dual supply  $\pm$ 3V.
- 140ksps throughput rate. Resolution is 12-bit.
- Internal synchronized clock.
- 12mW low power consumption.
- $6\mu s$  maximum conversion time.

The operation works on 0V to 2.5V conversion range, single supply 3V. Fig. 5.10 shows the typical application with single supply, and the pin functions is shown in Table. 5.1 and Table. 5.2



Fig. 5.10: Analog-to digital conter, LTC1282

Since A/D converter has internal clock, it does not need external clock to synchronize the control line. We can control the conversion by HSBN,  $\overline{CS}$ ,  $\overline{RD}$ . These input is produced by FPGA. First we set HSBN low to enable the slow memory mode. Control  $\overline{CS}$ and  $\overline{RD}$  low to read analog signal. Therefore. regular low for  $\overline{CS}$  and  $\overline{RD}$  can achieve the 512Hz sampling frequency. During the conversion status, the  $\overline{BUSY}$  signal from A/D

Table 5.1: Pin functions of input

| Inputs                   |                                                              |
|--------------------------|--------------------------------------------------------------|
| $A_{\rm IN}$             | Analog input. 0V to 2.5V (Unipolar) $\pm 1.25$ V (Bipolar)   |
| HSBN                     | High byte enable input.                                      |
|                          | This pin is used to multiplex the internal 12-bit conversion |
|                          | result into the lower bit outputs.                           |
| $\overline{\mathrm{CS}}$ | The Chip Select input must be low for the ADC to recognize   |
|                          | $\overline{\mathrm{RD}}$ and HBEN inputs.                    |
| $\overline{\mathrm{RD}}$ | Read input. The active low signal starts a conversion when   |
|                          | $\overline{\mathrm{CS}}$ and HSBN are low.                   |



converter ouput change to low and  $\overline{\text{BUSY}}$  return to high when conversion is over. The result output on pins D11,..., D0 in parallel. The timing diagram of the A/D converter is show in Fig. 5.11

### 5.3 FPGA System

A Field Programmable Gate Array (FPGA) is a semiconductor device that has programmable logic element. The logic element can be programmed to achieve the function of basic gate circuit like NOT, AND, OR and XOR, or some complex function such as mathematical



Fig. 5.11: Timing diagram of LTC1282

functions, decoders and flip-flop. FPGA can raise the speed of verification and reduce the cost. We adopt the Altera Cyclone II FPGA, shown in Fig. 5.12 to develop the hardware implementation. the features as follows

- EP2C35F672 Cyclone II FPGA Chip
- With 33,216 logic element
- M4K embedded memory blocks
- A 2.0 inch low temperature poly-silicon LCD
- 35  $18 \times 18$ -bit multipliers or  $9 \times 9$ -bit multipliers
- Supports active serial and JTAG configuration modes
- Supports Intellectual Property (IP) including Altera megafunction



Fig. 5.12: FPGA board

## 5.4 Baseline Wander Removal

Fig. 5.13 shows the block diagram of baseline wander removal. The adaptive subspace algorithm is summarized by the following equations

$$\boldsymbol{s}(n) = \alpha \boldsymbol{s}(n-1) + (1-\alpha)\boldsymbol{i}(n)\boldsymbol{i}^{T}(n)\boldsymbol{z}(n-1)$$
(5.3)

$$\boldsymbol{z}(n) = \frac{\boldsymbol{s}(n)}{\|\boldsymbol{s}(n)\|}$$
(5.4)

$$\boldsymbol{b}(n) = \boldsymbol{i}^{T}(n)\boldsymbol{z}(n)\boldsymbol{z}(n)$$
(5.5)

$$\boldsymbol{y}(n) = \boldsymbol{i}(n) - \boldsymbol{b}(n) \tag{5.6}$$

We perform the adaptive baseline wander removal sequentially from (5.3)–(5.6). In the baseline wander removal module, it needs a controller to decide the order of computation. One divider and one square root to calculate (5.4). One multiplier and one accumulator to compute all multiplication and inner product. One adder and one substractor perform the remaining operation. In the module we need three vectors to store the data; i(n), s(n) and z(n) are 12 bits, 26 bits, 10 bits, respectively. The length of three vectors are all 40. The control circuit includes a finite state machine (FSM). The FSM has two different types, Moore and Mealy [18]. In the Moore machine, the output is dependent on current



Fig. 5.13: Block diagram of baseline wander removal

state only. In the Mealy machine, the output is dependent on the current state and the input. In baseline wander removal module we use Mealy machine. There are four states in FSM, idle, BWR\_in, BWR\_com and BWR\_out, are shown in Fig 5.14. The first state is idle that initials and resets the module to wait the starting signal and input data. When the starting signal, IN\_VALID, is low, the idle state change to BWR\_in state and receive the input data to the vector i(n). In section 5.2 we know  $\overline{\text{BUSY}}$  is low when the conversion is in progress. The state change to BWR\_com state when IN\_VALID is high, and output the result in BWR\_out when the state is complete.

The working frequency of baseline wander removal module and heart rate estimator module is set to be 25 MHz. In BWR\_com state, we compute the inner product of i(n)and z(n-1) first. Since z(n) is always less than unity, we scale up the z(n) by  $2^{10}$  for decimal arithmetic. Then the inner product of i(n) and z(n-1) multiply  $1 - \alpha$  and i(n).  $\alpha$  and  $1 - \alpha$  are scaled up by  $2^{12}$ .

The next step, (5.4), needs one divider and one square root operation. Before performing the equation (5.4), we use multiplier and accumulator to caculate  $||s(n)||^2$ , and the result of (5.4) right shift 10 bits to obtain the norm. After (5.4), we need to obtain the baseline wandering component by (5.5). Therefore, the foregoing multiplier and accu-



mulator can be used to compute the inner product of i(n) and z(n) again and multiplies by latest z(n). We obtain the final output by subtracting the baseline wander component b(n) from input vector i(n) and the state returns to idle.

### 5.5 Fast Fourier Transform Algorithm

Discrete Fourier Transform (DFT) is used to widespread application in spectrum analysis, communication system and convolution. But it has mass calculation by computing directly. In the mid-1960, Cooley and Turkey propose the fast algorithm [19], called the Fast Fourier Transform (FFT), that can improve the large calculation when we implement the large N-point FFT. We introduce the basic of FFT algorithm [20].

• Radix-2 Algorithm

The N-point DFT of complex sequence x[n] in the time domain is defined as fol-

lows

$$X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk}, k = 0, ..., N-1$$
(5.7)

where  $W_N^{nk}$  is the twiddle factor, that is

$$W_N^{nk} = e^{-j(\frac{2\pi}{N})nk} \tag{5.8}$$

Since each X[k] is sum of N products, it is necessary  $N^2$  complex multiplications and  $N^2$  complex additions to compute directly. The computing complexity is  $O(N^2)$ . However, separates x[n] into two sequences of even part and odd part. i.e

$$X[k] = \sum_{n=0}^{N-1} x[n] W_N^{nk}$$
  

$$= \sum_{n \text{ even}} x[n] W_N^{nk} + \sum_{n \text{ odd}} x[n] W_N^{nk}$$
  

$$= \sum_{n=0}^{N-1} x[2n] W_N^{2nk} + \sum_{n=0}^{N-1} x[2n+1] W_N^{(2n+1)k}$$
  

$$= \sum_{n=0}^{N-1} x[2n] W_{N/2}^{nk} + W_N^{k} \sum_{n=0}^{N-1} x[2n+1] W_{N/2}^{nk}$$
  

$$= X_e[k] + W_N^{nk} X_o[k]$$
  
(5.9)

where  $X_e[k]$  represents the  $\frac{N}{2}$ -point DFT of even terms and  $X_o[k]$  represents the  $\frac{N}{2}$ -point DFT of odd terms. Since  $X_e[k]$  and  $X_o[k]$  are periodic with period  $\frac{N}{2}$ , we have  $X_e[k + \frac{N}{2}] = X_e[k]$  and  $X_o[k + \frac{N}{2}] = X_o[k]$ . The twiddle factor  $W_N^{n(k+N/2)} = -W_N^{nk}$ . Each  $\frac{N}{2}$ -point perform the same procedure by combining the  $\frac{N}{4}$ -point DFTs of even and odd terms. The radix-2 decimation-in-time (DIT) FFT algorithm can be applied until the length of DFTs reach to be 2. Finally, the result of an DIT 8-point FFT is shown in Fig. 5.15(a). The radix-2 FFT algorithm computation is regular that can compute recursively for each stage. Fig. 5.15(b) shows the basic computational pair is known as the butterfly computation that requires only one complex multiplication. The N-point DFT can be decomposed into  $\log_2 N$  stage. Each stage requires N complex multiplications and additions is of the order of  $N \log_2 N$ . Thus, the complexity is  $O(N \log_2 N)$ . Comparison of equation (5.7) and (5.9), the



Fig. 5.15: (a) Diagram of 8-point decimation-in-time FFT (b) Decimation-in-time butterfly

reduction of complexity is achieved. Similarly, the DFT can also be separated x[n]



Fig. 5.16: (a) Diagram of 8-point decimation-in-frequency FFT (b) Decimation-in-frequency butterfly

into  $\frac{N}{2}$  of odd and even terms in frequency domain respectively.

$$X[2r] = \sum_{n=0}^{N-1} x[n] W_N^{n2r}, r = 0, \dots, N/2 - 1$$
  
=  $\sum_{n=0}^{N/2-1} x[n] W_N^{n2r} + \sum_{n=N/2}^{N-1} x[n] W_N^{n2r}$   
=  $\sum_{n=0}^{N-1} x[n] W_N^{n2r} + \sum_{n=0}^{N-1} x[n+N/2] W_N^{2(n+N/2)r}$   
=  $\sum_{n=0}^{N-1} x[n] + x[n] + x[n] + N/2 W_N^{2nr}$  (5.10)

$$X[2r+1] = \sum_{n=0}^{N-1} x[n] W_N^{n(2r+1)}$$
  
=  $\sum_{n=0}^{N/2-1} x[n] W_N^{n(2r+1)} + \sum_{n=N/2}^{N-1} x[n] W_N^{n(2r+1)}$   
=  $\sum_{n=0}^{N-1} x[n] W_N^{n(2r+1)} + \sum_{n=0}^{N-1} x[n+N/2] W_N^{(n+N/2)(2r+1)}$   
=  $\sum_{n=0}^{N-1} [x[n] - x[n+N/2]] W_N^{n(2r+1)}$  (5.11)

The same as decimation-in-time algorithm, each  $\frac{N}{2}$ -point DFT is computed by (5.10), (5.11) and  $\frac{N}{4}$ -point. Finally, we also obtain the diagram of 8-point decimation-in-frequency FFT in Fig. 5.16(a) The basic decimation-in-frequency is shown in Fig. 5.16(b) shows the diagram of 8-point decimation-in-frequency butterfly computation.

• Radix-4 Algorithm

Radix-4 algorithm is base in radix-2 algorithm that usually uses in large N since radix-2 algorithm in large N require too much computations. Radix-4 FFT algorithm is more efficient computation than radix-2, but radix-4 needs N is power of four and the butterfly computation is more complex than radix-2. Similarly to the radix-2 algorithm decimates the N-point into four  $\frac{N}{4}$  as follows

ESP

$$\begin{split} X[k] &= \sum_{n=0}^{N-1} x[n] W_N^{nk} \\ &= \sum_{n=0}^{N/4-1} x[n] W_N^{nk} + \sum_{n=N/4}^{N/2-1} x[n] W_N^{nk} + \sum_{n=N/2}^{3N/4-1} x[n] W_N^{nk} + \sum_{n=3N/4}^{N-1} x[n] W_N^{nk} \\ &= \sum_{n=0}^{N/4-1} x[n] W_N^{nk} + \sum_{n=0}^{N/4-1} x[n+N/4] W_N^{(n+N/4)k} \\ &+ \sum_{n=0}^{N/4-1} x[n+N/2] W_N^{(n+N/2)k} + \sum_{n=0}^{N/4-1} x[n+3N/4] W_N^{(n+3N/4)k} \\ &= \sum_{n=0}^{N/4-1} [x[n] + (-j)^k x[n+N/4] (-1)^k x[n+N/2] + (j)^k x[n+3N/4]] W_N^{nk} \end{split}$$
(5.12)

The equation (5.12) is not related on  $\frac{N}{4}$ -point DFT because the twiddle factor depends on N. Thus, we obtain the decimation-in-frequency (DIF) from (5.12). it follows

$$X[4k] = \sum_{n=0}^{N/4-1} x[n] + x[n+N/4] + x[n+N/2] + x[n+3N/4]W_{N/4}^{nk}$$
(5.13)

$$X[4k+1] = \sum_{n=0}^{N/4-1} x[n] - jx[n+N/4] - x[n+N/2] + jx[n+3N/4]W_N^{nk}W_{N/4}^{nk}$$
(5.14)

$$X[4k+2] = \sum_{n=0}^{N/4-1} x[n] - x[n+N/4] + x[n+N/2] - x[n+3N/4]W_N^{2nk}W_{N/4}^{nk}$$
(5.15)

$$X[4k] = \sum_{n=0}^{N/4-1} x[n] + jx[n+N/4] - x[n+N/2] - jx[n+3N/4]W_N^{3nk}W_{N/4}^{nk}$$
(5.16)

Then it has four  $\frac{N}{4}$ -point DFTs that we can obtain the basic radix-4 butterfly according to (5.13), (5.14), (5.15), (5.16) as shown in Fig. 5.17(a) Fig. 5.17(b) shows the diagram of 16-point radix-4 FFT algorithm.

#### 5.6 FFT Architecture for Implementation

In section 5.5, we have introduced several FFT algorithm. It is important to choose the proper architecture for requirement of function. In this section we will introduce the architecture of FFT. They can be classified roughly into two types: the pipeline and the memory-based architecture.

#### 5.6.1 Memory-based Architecture

The Purpose of the memory-based architecture is increasing the reusable rate of memory and butterfly processing element(PE). Fig. 5.18 is the block diagram of memory-based architecture. It is composed by controller, processing element, address generator, RAM, and ROM. The controller control the RAM and ROM to read or write, and control the





Fig. 5.17: (a) Radix-4 decimation-in-frequency butterfly (b) Diagram of 16-point radix-4 DIF FFT

address generator to generate the specific address to RAM and ROM by difference stage of FFT. RAM store the data of FFT. ROM store the twiddle factor.

At first, RAM receives the input data, then according to the address which address generator produces, RAM and ROM read data to the PE to perform the butterfly computation. The result after PE in-place to the same RAM address. In general, it requires one PE to provide whole FFT.



Fig. 5.18: Block diagram of memory-based architecture

#### 5.6.2 Pipeline Architecture

The pipeline architecture has two types: single-path delay feedback (SDF) and multipath delay commutator (MDC) [21]. Both of them use a series of register as the delay buffer. Fig. 5.19 shows the pipeline architecture of SDF for radix-2 algorithm. Unlike the memory-based FFT architecture, SDF pipeline architecture can apply various FFT algorithm and maintain the regularity. SDF also has higher throughput than memory-



Fig. 5.19: Radix-2 DIF SDF



based but requires more hardware cost. The SDF operation has two modes shown in Fig. 5.20. Fig. 5.21 shows the block diagram of multi-path delay commutator (MDC).



Fig. 5.21: Radix-2 DIF MDC

MDC separate the input data into two parallel data streams, one directly output to the next stage, another multiply the coefficient. Using the switch to achieve the butterfly computation. In general, MDC architecture has higher throughput than SDF.

#### 5.7 Heart Rate Estimator

The primary module of heart rate estimator is FFT and IFFT modules. We can apply IFFT by FFT since FFT and IFFT architecture are similarly. Thus we introduce the proposed

FFT module in this section.

• FFT requirements:

Before we construct the FFT module, we have to define the specification from requirement of heart rate estimation system. In order to evaluate the heart rate, it must have two R waves to represent the R-R interval. The worst case is the R wave occurring in middle. The situation is shown in Fig. 5.22. The sampling frequency



is 512Hz and the minimum heart rate that we can estimate, set to be 30 per minute. Therefore, we need 4096-point FFT processor since it also has 2048-point zero padding. Thus, the FFT block diagram is shown in Fig. 5.23

We adopt the memory-based FFT architecture since the system operates in low frequency and needs low power and area for wearable system. The basic butterfly computation uses radix-4 algorithm and it is more efficient than radix-2.

In the FFT module, we use four states as shown in Fig. 5.24 First state is idle, it is a reset state that reset the FFT module to initial value. When IN\_VALID is high, FFT processor starts to receive the input data and the state change to input\_state . Then, when input completely, the input\_s is high and enters to the compute\_state to compute the FFT. It outputs the result at output\_state and returns to idle.



Fig. 5.24: State flow chart of the FFT module

In the FFT proposed processor, we use only one memory, more memory block reflect larger area and power dissipation. In input state, address generator produces the address to RAM with 32-bit, 4096 words that the first 16 bits are real part and the remaining are imaginary part. For radix-2 algorithm, we know when the input is order, the output is bit reverse [20]. In the proposed architecture, we use the same concept in radix-4 algorithm, group reverse, is shown in Fig. 5.25 During the compute\_state, the basic computation of



Fig. 5.25: Group reverse between input and output

FFT processor needs one radix-4 butterfly computation. We can decompose the 4096point FFT as following

• Stage 1

$$X[k] = \sum_{n=0}^{4095} x[n] W_{4096}^{nk}$$

$$\begin{cases} n = 1024n_1 + n_2 & n_1 = 0, \dots, 3, n_2 = 0, \dots, 1023 \\ k = k_1 + 4k_2 & k_1 = 0, \dots, 3, k_2 = 0, \dots, 1023 \end{cases}$$

$$X[k_1 + 4k_2] = \sum_{n_2=0}^{1023} \sum_{n_1=0}^{3} x[1024n_1 + n_2] W_{4096}^{(1024n_1 + n_2)(k_1 + 4K_2)}$$
$$= \sum_{n_2=0}^{1023} \sum_{n_1=0}^{3} x[1024n_1 + n_2] W_4^{n_1k_1} W_{4096}^{n_2k_1} W_{1024}^{n_2k_2}$$

Let 
$$G(n_2, k_1) = \sum_{n_1=0}^{3} x[1024n_1 + n_2] W_4^{n_1k_1} W_{4096}^{n_2k_1}$$
 (5.17)

• Stage 2

$$X[k_1 + 4K_2] = \sum_{n_2=0}^{1023} G(n_2, k_1) W_{1024}^{n_2 k_2}$$

$$\begin{cases} n_2 = 256n_3 + n_4 & n_3 = 0, \dots, 3, n_4 = 0, \dots, 255 \\ k_2 = k_3 + 4k_4 & k_3 = 0, \dots, 3, k_4 = 0, \dots, 255 \end{cases}$$

$$X[k_1 + 4k_3 + 16k_4] = \sum_{n_4=0}^{255} \sum_{n_3=0}^{3} G(256n_3 + n_4, k_1) W_{1024}^{(256n_3 + n_4)(k_3 + 4k_4)}$$
$$= \sum_{n_4=0}^{255} \sum_{n_3=0}^{3} G(256n_3 + n_4, k_1) W_4^{n_3k_3} W_{1024}^{n_4k_3} W_{256}^{n_4k_4}$$

Let 
$$H(n_4, k_1, k_3) = \sum_{n_3=0}^{3} G(256n_3 + n_4, k_1) W_4^{n_3k_3} W_{1024}^{n_4k_3}$$
 (5.18)

• Stage 3

$$X[k_1 + 4k_3 + 16k_4] = \sum_{n_4=0}^{255} H(n_4, k_1, k_3) W_{256}^{n_4k_4}$$

$$\begin{cases} n_4 = 64n_5 + n_6 & n_5 = 0, \dots, 3, n_6 = 0, \dots, 63 \\ k_4 = k_5 + 4k_6 & k_5 = 0, \dots, 3, k_6 = 0, \dots, 63 \\ X[k_1 + 4k_3 + 16k_5 & 1896 & 1896 & 1896 & 1896 \\ + 64k_6] = \sum_{n_6=0}^{63} \sum_{n_5=0}^{63} H(64n_5 + n_6, k_1, k_3) W_{64}^{(64n_5 + n_6)(k_5 + 4k_6)} \\ = \sum_{n_6=0}^{63} \sum_{n_5=0}^{3} H(64n_5 + n_6, k_1, k_3) W_4^{n_5k_5} W_{256}^{n_6k_5} W_{64}^{n_6k_6} \end{cases}$$

Let 
$$I(n_6, k_1, k_3, k_5) = \sum_{n_5=0}^{5} H(64n_5 + n_6, k_1, k_3) W_4^{n_5k_5} W_{256}^{n_6k_5}$$
 (5.19)

• Stage 4

$$X[k_1 + 4k_3 + 16k_5 + 64k_6] = \sum_{n_6=0}^{63} I(n_6, k_1, k_3, k_5) W_{64}^{n_6k_6}$$

$$\begin{cases} n_6 = 16n_7 + n_8 & n_7 = 0, \dots, 3, n_8 = 0, \dots, 15\\ k_6 = k_7 + 4k_8 & k_7 = 0, \dots, 3, k_8 = 0, \dots, 15 \end{cases}$$

$$X[k_{1} + 4k_{3} + 16k_{5} + 64k_{7} + 256k_{8}] = \sum_{n_{8}=0}^{15} \sum_{n_{7}=0}^{3} I(64n_{7} + n_{8}, k_{1}, k_{3}, k_{5}) W_{64}^{(16n_{7} + n_{8})(k_{7} + 4k_{8})}$$
$$= \sum_{n_{6}=0}^{15} \sum_{n_{5}=0}^{3} I(64n_{7} + n_{8}, k_{1}, k_{3}, k_{5}) W_{4}^{n_{7}k_{7}} W_{64}^{n_{8}k_{7}} W_{16}^{n_{8}k_{8}}$$
Let  $J(n_{8}, k_{1}, k_{3}, k_{5}, k_{7}) = \sum_{n_{7}=0}^{3} I(64n_{7} + n_{8}, k_{1}, k_{3}, k_{5}) W_{4}^{n_{7}k_{7}} W_{64}^{n_{8}k_{7}}$  (5.20)

• Stage 5

$$X[k_{1} + 4k_{3} + 16k_{5} + 64k_{7} + 256k_{8}] = \sum_{n_{8}=0}^{15} J(n_{8}, k_{1}, k_{3}, k_{5}, k_{7})W_{16}^{n_{8}k_{8}}$$

$$\begin{cases} n_{8} = 4n_{9} + n_{10} \quad n_{9} = 0, \dots, 3, \quad n_{10} = 0, \dots, 3\\ k_{8} = k_{9} + 4k_{10} \quad k_{9} = 0, \dots, 3, \quad k_{10} = 0, \dots, 3 \end{cases}$$

$$X[k_{1} + 4k_{3} + 16k_{5} + 64k_{7} + 256k_{9} + 1024k_{10}] = \sum_{n_{10}=0}^{3} \sum_{n_{9}=0}^{3} J(4n_{9} + n_{10}, k_{1}, k_{3}, k_{5}, k_{7})W_{16}^{(4n_{9} + n_{10})(k_{9} + 4k_{10})}$$

$$= \sum_{n_{10}=0}^{3} \sum_{n_{9}=0}^{3} J(4n_{9} + n_{10}, k_{1}, k_{3}, k_{5}, k_{7})W_{4}^{n_{9}k_{9}}W_{16}^{n_{10}k_{9}}W_{4}^{n_{10}k_{10}}$$
Let  $K(n_{10}, k_{1}, k_{3}, k_{5}, k_{7}, k_{9}) = \sum_{n_{9}=0}^{3} J(4n_{9} + n_{10}, k_{1}, k_{3}, k_{5}, k_{7})W_{4}^{n_{9}k_{9}}W_{16}^{n_{10}k_{9}}$ 
(5.21)

• Stage 6

$$X[k_1 + 4k_3 + 16k_5 + 64k_7 + 256k_9 + 1024k_{10}] = \sum_{n_{10}=0}^3 K(n_{10}, k_1, k_3, k_5, k_7, k_9) W_4^{n_{10}k_{10}}$$
(5.22)

The property of twiddle factor is symmetry, we only need 1/8 twiddle factors in ROM. Fig. 5.26 shows the symmetry property of twiddle factor zone 0 are kept in ROM, other seven zones are converted from the twiddle factor in zone 0.



Fig. 5.26: The symmetry property of twiddle factor

In butterfly computation we use 26 adders and 4 multiplier. Each computation for butterfly needs 4 clocks to read and 4 clocks to write back. For memory assignment, we propose a way to generate the address for RAM and ROM without computation. During FFT computation, the counter accumulates for each stage, we use the counter to produce the address that group slide the counter. Fig. 5.27 depicts the addressing algorithm of FFT. Where  $C_{11}, C_{10}, \ldots, C_1$  are bits of counter. From equation (5.17)–(5.22), we know in the stage 1, the address of RAM is the same as counter. Stage 2  $c_{11}c_{10}$  move to LSB and other bits shift left 2 bits. perform the same thing for each stage can produce the address for RAM.

For the same reason, ROM address can also produce by counter without computation. From (5.17)–(5.22), we know  $W_{4096}^{n_2k_1}, W_{1024}^{n_4k_3}, W_{256}^{n_6k_5}, W_{64}^{n_8k_7}, W_{16}^{n_{10}k_9}$  are twiddle factors that keeps in ROM. The ROM address assignment can provide by counter in Fig. 5.28.

The inverse FFT (IFFT) can be realized by the same architecture of FFT. the IDFT formula is given by

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] W_N^{-nk}$$
(5.23)



Fig. 5.28: ROM address assignment of FFT

Multiplying both sides of (5.23) by N, we have

$$Nx[n] = \sum_{k=0}^{N-1} X[k] W_N^{-nk}$$
  

$$Nx^*[n] = \sum_{k=0}^{N-1} X^*[k] W_N^{nk}$$
  

$$= \sum_{k=0}^{N-1} X[k] W_N^{nk} \text{ if } X[k] \text{ is real}$$
(5.24)

Thus, we can perform the IFFT by computing FFT directly.

### 5.8 Experimental and Result



Fig. 5.29: Brief system specifications

The brief system specifications are shown in Fig 5.29. After we design the FFT module and apply to real time heart rate estimation system, we compile the RTL code to FPGA and the compile report is shown in Fig. 5.30. In the report, we use 7541 logic elements which is 23 percent of the total logic elements in this FPGA board. And the used registers occupies 8 percent of all dedicate logic registers. The usage of on-chip memory is 38 percent. The result display on the FPGA device is shown in Fig. 5.31 The first three seven-segment display that express the signal quality indicator is 0.94. The best is 1, and the other three seven-segment display show the heart rate, 74 heartbeats per minute.

| Flow Status                                       | Successful - Thu May 26 15:09:14 2011    |
|---------------------------------------------------|------------------------------------------|
| Quartus II Version                                | 10.1 Build 153 11/29/2010 SJ Web Edition |
| Revision Name                                     | HeartRateEstimator                       |
| Top-level Entity Name                             | HeartRateEstimator                       |
| Family                                            | Cyclone II                               |
| Device                                            | EP2C35F672C8                             |
| Timing Models                                     | Final                                    |
| Total logic elements                              | 7,541 / 33,216 ( 23 % )                  |
| <ul> <li>Total combinational functions</li> </ul> | 7,234 / 33,216 ( 22 % )                  |
| Dedicated logic registers                         | 2,765 / 33,216 (8 %)                     |
| Total registers                                   | 2765                                     |
| Total pins                                        | 78 / 475 ( 16 % )                        |
| Total virtual pins                                | 0                                        |
| Total memory bits                                 | 182,784 / 483,840 ( 38 % )               |
| Embedded Multiplier 9-bit elements                | 16 / 70 ( 23 % )                         |
| ····· Total PLLs                                  | 0/4(0%)                                  |

#### Fig. 5.30: Compile report of the entire system



Fig. 5.31: Results display on the FPGA device

# **Chapter 6**

# **Conclusion and Future Work**

In this thesis, we focus on a memory-based architecture FFT design for the heart rate estimator. We use one single-port memory block to achieve radix-4 FFT, and the memory address assignment depends on a counter and current stage. We reduce the complexity of the address generator and apply it in low frequency system.

In the future work, The system can add the function of wireless transmission that can be transmitted to central terminal and can then be analyzed the waveform by computers or cellphones. In FPGA implementation, real-value architecture may be applied in system to increase the efficiency and reduce the power dissipation of the system. Consider the long FFT idle time since we estimate the heart rate every 512 samples, the FFT can use more clock cycle to compute FFT and IFFT and reduce the power consumption and area of FFT module. For minimizing the area of the system, we can design an application specific integrated circuit (ASIC) that incorporate with the front-end circuits and A/D converter.

# **Bibliography**

- [1] M Di Rienzo, F Rizzo, G Parati, M Ferratini, G Brambilla, and P Castiglioni "A Textile-Based Wearable System for Vital Sign Monitoring: Applicability in Cardiac Patients " Computers in Cardiology, pp. 699V701, 2005.
- [2] H. Ding, S. Crozier, and S. Wilson, "A New Heart Rate Variability Analysis Method by Means of Quantifying the Variation of Nonlinear Dynamic Patterns," *IEEE Transactions on Biomedical Engineering*, vol. 54, no. 9, 2007.
- [3] K. Hung, Y. T. Zhang, and B. Tai, "Wearable Medical Devices for Tele-Home Healthcare", Proc. of 26th Int. Conf. IEEE EMBS, San Francisco, pp. 5384-5387, Sept. 2004.
- [4] J. Muhlsteff, O. Such, R. Schmidt, M. Perkuhn, H. Reiter, J. Lauter, J. Thijs, G. Musch, M. Harris, "Wearable approach for continuous ECG and activity patient-monitoring", 2004. IEMBS '04. 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2004, pp. 2184-2187. 2004.
- [5] L. Grajales, I.V. Nicolaescu, "Wearable multisensor heart rate monitor" International Workshop on Wearable and Implantable Body Sensor Networks, pp. 153-157, 2006.
- [6] Kher, R. Vala, D. Pawar, T. Thakar, V.K. "Implementation of derivative based QRS complex detection methods" 2010 3rd International Conference on Biomedical Engineering and Informatics (BMEI), pp. 927-931, 2010.
- [7] Dinh, H.A.N. Kumar, D.K. Pah, N.D. Burton, P. "Wavelets for QRS detection" Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2001, pp. 1883-1887, 2001.

- [8] Ji-Le Hu, Shu-Di Bao, "An approach to QRS complex detection based on multiscale mathematical morphology" 2010 3rd International Conference on Biomedical Engineering and Informatics (BMEI), pp. 725-729, 2010
- [9] J. G. Webster, Medical Instrumentation Application and Design, John Wiley and Sons, Inc., 1998.
- [10] G. H. Golub and C. F. Van Loan, *Matrix Computation*, Baltimore, MD: John Hopkins University Press, 1989.
- [11] M. H. Cheng, L. C. Chen and Y. C. Hung, "A Real-Time Heart-Rate Estimator from Steel Textile ECG Sensors in a Wireless Vital Wearing System," 2nd Int. Conf. Bioinformatics and Biomedical Engineering, Shanghai, China, May 2008.
- [12] B-U Kohler et al, "The principles of software QRS detection," IEEE Eng. in Medicine and Biology Magazine, vol. 21, pp. 42-57, Jan. 2002.
- [13] C. Li, C. Zheng, and C. Tai, Detection of ECG characteristic points using Wavelet Transforms, *IEEE Transactions on Biomedical Engineering*, vol. 42, no. 1, pp. 21-28, Jan., 1995.
- [14] V. X. Afonso, W. J. Tompkins, T. Q. Nguyen, and S. Luo. "ECG Beat Detection using Filter Banks". *IEEE Trans. on Biomedical Engineering*, 46(2):192-202, Feb. 1999.
- [15] P. Maragos and R. W. Schafer. "Morphological Systems for Multidimensional Signal Processing." *Proceedings of the IEEE*, 78(4):690-710, April 1990.
- [16] Leland B. Jackson Digital Filters and Signal Processing, Kluwer Academic Publishers, 1996.
- [17] M. E. Van Valkenburg, Analog Filter Design, Oxford University Press, 1995.
- [18] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, Addison Wesley, 2004.

- [19] J. W. Cooley and J. W. Tukey, "An Algorithm for the Machine Calculation of Complex Fourier Series," *Math.Comp.*, Vol. 19, pp. 297-301, April 1965.
- [20] Alan V. Oppenheim and Ronald W. Schafer, *Discrete-Time Signal Processing*, Prentice Hall International, Inc., 1999.
- [21] S. He and M. Torkelson, "Design and implementation of a 1024-point pipeline FFT processor," in Proc. IEEE Custom Integrated Circuits Conf., pp. 131-134, May 1998

