

# Application

| 2015-04347 | Gustafsson, Oscar | NT-14 |
|------------|-------------------|-------|

**Information about applicant**

| Field                       | Value                               |
|-----------------------------|-------------------------------------|
| Name                        | Oscar Gustafsson                    |
| Birthdate                   | 19730330                            |
| Gender                      | Male                                |
| Doctoral degree             | 2003-09-26                          |
| Academic title              | Docent                              |
| Employer                    | Linköpings universitet              |
| Administrating organisation | Linköpings universitet              |
| Project site                | Institutionen för systemteknik (ISY) |

**Information about application**

| Field                    | Value                                                                                                    |
|--------------------------|----------------------------------------------------------------------------------------------------------|
| Call name                | Forskningsbidrag Stora utlysningen 2015 (Naturvetenskap och teknikvetenskap)                            |
| Type of grant            | Projektbidrag                                                                                            |
| Focus                    | Fri                                                                                                      |
| Subject area             |                                                                                                          |
| Project title (english)  | Low-Power and High-Speed Digital Signal Processing - Impact of Parallelism on Computational Complexity  |
| Project start            | 2016-01-01                                                                                               |
| Project end              | 2019-12-31                                                                                               |
| Review panel applied for | NT-14, NT-13                                                                                             |
| Classification code      | 20299. Annan elektroteknik och elektronik, 20205. Signalbehandling                                       |
| Keywords                 | Parallellism, Signalbehandling, Redundans, Effektförbrukning, Integrerade kretsar                        |

**Funds applied for**

| Year   | 2016  | 2017  | 2018  | 2019  |
|--------|-------|-------|-------|-------|
| Amount | 1,154 | 1,266 | 1,299 | 1,328 |

**Participants**

| Field           | Value                  |
|-----------------|------------------------|
| Name            | Håkan Johansson        |
| Birthdate       | 19690701               |
| Gender          | Male                   |
| Doctoral degree | 1998-05-18             |
| Academic title  | Professor              |
| Employer        | Linköpings universitet |

## **Descriptive data**

# Project info

# Project title (Swedish)\*

Snabb och energieffektiv signalbehandling - inverkan av parallellism i algoritm och arkitektur

# **Project title (English)\***

Low-Power and High-Speed Digital Signal Processing - Impact of Parallelism on Computational Complexity

# Abstract (English)\*

The required data rates of communication systems increase rapidly over time, and in many cases, such as mobile communication, this is the primary driving force behind new standards and technologies. The computational effort typically increases faster than linearly with the data rate, as transmission schemes with higher complexity must often be used to increase the data rate. From an integrated circuit implementation point of view, keeping the power and energy consumption low is the major design challenge, and it becomes harder as the amount of computation and the data rates increase. The power consumption mainly affects the short-term properties, e.g., peak power supply current and temperature, while the energy consumption becomes a limiting factor for battery-supplied applications. One motivation for research on energy-efficient implementation is, naturally, that the battery lifetime of mobile devices increases if the energy consumption can be decreased. Also, cooling becomes a major problem when the power consumption is high, as does supplying enough current to the integrated circuits that perform the computations. Taking environmental issues into account, as there are billions of integrated circuits operating around the world, it is clear that there is a general interest in keeping the power and energy consumption as low as possible.

Primarily, we will study the effect of parallelism on the implementation of high-speed digital processing algorithms, typically for digital front-ends. A digital front-end is here defined as the unit performing digital signal processing close to the data converters. (It is not of interest to define the exact cut between the digital front-end and the baseband processing here. For example, in OFDM systems, it can be argued whether the DFT/IDFT is part of the digital front-end or the baseband signal processing. As it is a well-defined function operating at a high sample rate, it is a function to be considered in this project.) By high-speed, we here refer to systems where the **data rate is higher than the obtainable clock rate**, so that multiple samples must be processed in parallel, typically at data rates in the GSa/s range. Examples include transmitters and receivers for optical communication, gigabit networking, wireless and wired transmission of high-definition video, radar, and parallel data acquisition as in oscilloscopes and software-defined radio. Note that the results will also be applicable to circuits working at lower sample rates, enabling close to energy-optimal operation in the subthreshold region.

Of particular interest is to determine an optimal degree of parallelism, i.e., the ratio between the data rate and the clock frequency. A higher ratio increases the chip area, as the computational cores are replicated, but at the same time increases the opportunity to modify the algorithm in a favorable way such that the chip area increases less than linearly. This is achieved by utilizing computational redundancy and the fact that operators become more specialized. Similarly, a higher ratio may increase the static power consumption while decreasing the dynamic power consumption, as it enables a lower power supply voltage. The project will focus on fundamental DSP algorithms such as digital filters and transforms, which find applications in practically all communication systems, although we will also identify general principles.

# Popular scientific description (translated from Swedish)\*

As data rates in communication systems increase, so does the amount of computation that must be performed per unit of time. This places ever greater demands on the circuits that perform the computations, so that they do not become too hot or draw too much energy, which both determines battery life and affects overall energy consumption, considering how many electrical devices are constantly switched on.

In most cases the computations must be performed in parallel, so that many different values are computed simultaneously. How difficult this is depends on the type of computation. To make really efficient circuits, simplifications must be exploited as far as possible. The amount of simplification depends on how many values are computed simultaneously, so that it actually becomes relatively easier to compute many values at the same time. However, this is a relatively neglected research area, as the need for really fast computations has not been that great in the past, so it is difficult to determine exactly how many values should be computed simultaneously for the energy per computed value to be as low as possible. At the same time, a basic requirement is of course that enough values are computed in time to meet the requirements of the communication system.

In this project we will study the relationships between the number of values computed simultaneously and the consequences this has for the amount of computation that needs to be performed and, ultimately, for the integrated circuits. We do this so that integrated circuits with really low energy consumption that still keep up with ever faster data rates can be designed better in the future. We focus on some of the most fundamental, but also most common, types of computations: FIR filters and FFTs.

# **Project period**

Number of project years\* 4

Calculated project time\* 2016-01-01 - 2019-12-31

Classifications

Select a minimum of one and a maximum of three SCB-codes in order of priority.


| Priority | SCB-code\*                                                                                 |
|----------|-------------------------------------------------------------------------------------------|
| 1        | 2. Teknik > 202. Elektroteknik och elektronik > 20299. Annan elektroteknik och elektronik |
| 2        | 2. Teknik > 202. Elektroteknik och elektronik > 20205. Signalbehandling                   |

Enter a minimum of three, and up to five, short keywords that describe your project.

Keyword 1\* Parallellism

Keyword 2\* Signalbehandling

Keyword 3\* Redundans

Keyword 4 Effektförbrukning

Keyword 5 Integrerade kretsar

## **Research plan**

# **Ethical considerations**

Specify any ethical issues that the project (or equivalent) raises, and describe how they will be addressed in your research. Also indicate the specific considerations that might be relevant to your application.

## **Reporting of ethical considerations\***

None of the research questions addressed are judged to raise any ethical issues.

The project includes handling of personal data

No

The project includes animal experiments

No

Account of experiments on humans

No

**Research plan** 

# LOW-POWER AND HIGH-SPEED DIGITAL SIGNAL PROCESSING – IMPACT OF PARALLELISM ON COMPUTATIONAL COMPLEXITY

# 1 Purpose and Aims

The required data rates of communication systems increase rapidly over time, and in many cases, such as mobile communication, this is the primary driving force behind new standards and technologies. The computational effort typically increases faster than linearly with the data rate, as transmission schemes with higher complexity must often be used to increase the data rate. From an integrated circuit implementation point of view, keeping the power and energy consumption low is the major design challenge, and it becomes harder as the amount of computation and the data rates increase. The power consumption mainly affects the short-term properties, e.g., peak power supply current and temperature, while the energy consumption becomes a limiting factor for battery-supplied applications. One motivation for research on energy-efficient implementation is, naturally, that the battery lifetime of mobile devices increases if the energy consumption can be decreased. Also, cooling becomes a major problem when the power consumption is high, as does supplying enough current to the integrated circuits that perform the computations. Taking environmental issues into account, as there are billions of integrated circuits operating around the world, it is clear that there is a general interest in keeping the power and energy consumption as low as possible.

Primarily, we will study the effect of parallelism on the implementation of high-speed digital processing algorithms, typically for digital front-ends. A digital front-end is here defined as the unit performing digital signal processing close to the data converters<sup>1</sup>. By high-speed, we here refer to systems where the **data rate is higher than the obtainable clock rate**, so that multiple samples must be processed in parallel, typically at data rates in the GSa/s range. Examples include transmitters and receivers for optical communication, gigabit networking, wireless and wired transmission of high-definition video, radar, and parallel data acquisition as in oscilloscopes and software-defined radio. Note that the results will also be applicable to circuits working at lower sample rates, enabling close to energy-optimal operation in the subthreshold region.

Of particular interest is to determine an optimal **degree of parallelism**, i.e., the ratio between the data rate and the clock frequency. A higher ratio increases the chip area, as the computational cores are replicated, but at the same time increases the opportunity to modify the algorithm in a favorable way such that the chip area increases less than linearly. This is achieved by **utilizing computational redundancy** and the fact that **operators become more specialized**. Similarly, a higher ratio may increase the static power consumption while decreasing the dynamic power consumption, as it enables a lower power supply voltage. The project will focus on **fundamental DSP algorithms** such as digital filters and transforms, which find applications in practically all communication systems, although we will also **identify general principles**.

Efficient implementation of DSP and communication algorithms and arithmetic operators has been the research topic of the PI, Dr. Oscar Gustafsson, for a long time, and the co-PI, Prof. Håkan Johansson, contributes extensive experience in design of DSP and communication algorithms, providing a solid foundation for the success of the project.

# 2 Background and Survey of the Field

We first provide some background on power consumption in integrated circuits, current state-of-the-art, and examples of application areas where the results of this project can be relevant.

<sup>1</sup>It is not of interest to define the exact cut between the digital front-end and the baseband processing here. For example, in OFDM systems, it can be argued whether the DFT/IDFT is part of the digital front-end or the baseband signal processing. As it is a well-defined function operating at a high sample rate, it is a function to be considered in this project.

Complementary metal oxide semiconductor (CMOS) technology is, and will remain, the technology of choice for most large-scale digital integrated circuits. The power consumption in CMOS can be split into three components [1]: dynamic, short-circuit, and static. The dynamic power consumption decreases with the square of the power supply voltage, while the corresponding decrease in speed is significantly smaller. Hence, a circuit can be parallelized and operated at a lower clock frequency, and since the lower clock frequency allows the power supply voltage to be decreased, the dynamic power is reduced. As a result, it is important to operate at the optimal degree of parallelism and power supply voltage [2, 3]. For modern CMOS processes with smaller line widths, the static power consumption tends to become more and more significant [4]. This is caused by the transistors not being fully turned off and by increased leakage through the gate oxide. Increasing the parallelism increases the static power consumption, and as it is independent of the clock frequency, the same benefit as for the dynamic power is not obtained. In fact, the leakage current of the transistors will typically increase, as they tend to be "less off" when operating at a lower power supply voltage. This can partly be counteracted by adjusting the circuit body bias voltage. It should be noted that the energy-optimal power supply voltage (in terms of energy per operation) is typically obtained in subthreshold operation. However, as the clock frequency is drastically reduced when approaching the subthreshold region, the parallelism must typically be increased by several orders of magnitude for the considered applications, so that the circuit area increases significantly, in many cases yielding uneconomically large circuits. Still, the results obtained from this project will be of interest for lower-rate processing with stringent specifications on the energy consumption, e.g., in implantable circuits.
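The trade-off described above can be illustrated with a small numerical sketch. It is a toy model under stated assumptions, not the methodology of [2, 3]: the critical-path delay follows an alpha-power-law dependence on the supply voltage, the leakage current per parallel core is taken as independent of the supply voltage, the number of operations per sample does not change with the parallelism (no redundancy is exploited), and all parameter values are hypothetical.

```python
# Toy model of energy per sample versus degree of parallelism M.
# Every constant below is a hypothetical value chosen for illustration only.
V_NOM = 1.0      # nominal supply voltage [V]
V_MAX = 1.2      # highest allowed supply voltage [V]
V_MIN = 0.35     # lowest supply voltage considered [V], kept above V_T
V_T = 0.3        # threshold voltage [V]
ALPHA = 1.5      # exponent of the assumed alpha-power delay model
T_NOM = 1e-9     # critical-path delay at V_NOM [s]
C_EFF = 1e-12    # effective switched capacitance per sample [F]
I_LEAK = 1e-4    # leakage current per parallel core [A] (assumed constant)
F_SAMPLE = 2e9   # required sample rate [Sa/s]


def critical_path_delay(v):
    """Critical-path delay at supply voltage v (assumed alpha-power model)."""
    return T_NOM * (v / V_NOM) * ((V_NOM - V_T) / (v - V_T)) ** ALPHA


def min_supply_voltage(m):
    """Lowest supply voltage meeting the clock period m / F_SAMPLE, or None."""
    period = m / F_SAMPLE
    if critical_path_delay(V_MAX) > period:
        return None              # sample rate not reachable with m cores
    if critical_path_delay(V_MIN) <= period:
        return V_MIN
    lo, hi = V_MIN, V_MAX        # delay decreases monotonically with v
    for _ in range(60):          # bisection on the timing constraint
        mid = 0.5 * (lo + hi)
        if critical_path_delay(mid) <= period:
            hi = mid
        else:
            lo = mid
    return hi


def energy_per_sample(m):
    """Dynamic plus static energy per sample for m parallel cores."""
    v = min_supply_voltage(m)
    if v is None:
        return None
    dynamic = C_EFF * v ** 2                 # same operation count per sample
    static = m * I_LEAK * v / F_SAMPLE       # leakage grows with replication
    return dynamic + static


def best_parallelism(max_m=64):
    """Degree of parallelism with the lowest energy per sample."""
    feasible = [(energy_per_sample(m), m) for m in range(1, max_m + 1)
                if min_supply_voltage(m) is not None]
    return min(feasible)[1]
```

With these numbers a single core cannot reach the sample rate at all, and the energy per sample has an interior minimum: beyond it, the added leakage outweighs the savings in dynamic energy. The model deliberately omits the effects studied in the project, namely that the operation count and operator complexity themselves change with the parallelism.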

# 2.2 State of the art

The project deals with several abstraction levels, from systems and applications, with a focus on algorithms and architectures, down to arithmetic and circuits. For specific algorithms and architectures, the state of the art is discussed when they are introduced in Section 3. The general problem of mapping DSP algorithms to hardware with different degrees of parallelism in the context of power supply voltage scaling was extensively studied in [3]. The methodology presented there is the most exhaustive approach to date, including, e.g., many techniques discussed in [5, 6]. However, as will be discussed and later summarized in Table 3, there are several aspects which were not covered. In particular, the main focus of this project, how parallelism affects the computational redundancy and the specialization of operators, was not covered.

# 2.3 Example application areas

While the objective of this project is fundamental research on the effect of parallelism on achievable energy dissipation limits, not product development, there are a number of recent and upcoming application areas that illustrate typical use cases for the outcomes of the project.

The first one is wireless transmission in the 60 GHz band, e.g., for wireless high-definition video and audio streams [7, 8]. As an example, the baseband data stream of WirelessHD [8] has a sample rate of around 2.5 GSa/s. With a 512-point FFT, this means that about 11.5 billion complex multiplications must be performed per second, something that requires a dedicated FFT architecture [9].
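The order of magnitude of this figure can be checked with a short calculation. The sketch below assumes a radix-2 FFT with $\frac{N}{2} \log_2 N$ complex multiplications per transform, back-to-back transforms with no overlap, and a sample rate of exactly 2.5 GSa/s; the function name is hypothetical.

```python
def fft_complex_mults_per_second(n_points, sample_rate):
    """Complex multiplications per second for continuous N-point radix-2 FFTs.

    Assumes n_points is a power of two and that consecutive transforms
    do not overlap, so one transform is needed per n_points samples.
    """
    log2_n = n_points.bit_length() - 1          # exact log2 for powers of two
    mults_per_transform = (n_points // 2) * log2_n
    transforms_per_second = sample_rate / n_points
    return mults_per_transform * transforms_per_second


rate = fft_complex_mults_per_second(512, 2.5e9)
```

This gives 4.5 complex multiplications per sample and 11.25 billion per second at exactly 2.5 GSa/s, which is consistent with the quoted figure of about 11.5 billion if the actual sample rate is slightly above 2.5 GSa/s.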

The next example is optical fiber transmission, reaching hundreds of Gb/s and in the future several Tb/s. DSP is used more and more to mitigate errors at the end points. For example, chromatic dispersion can be compensated using FIR filters, but these often have a length of hundreds or thousands of taps for long fibers [10, 11]. Hence, the project will enable higher data rates and better correction (longer filter lengths) at a reduced power budget. Also, optical transmission calls for high-speed data converters to meet the high sample rate requirements.

The general case of high-speed analog-to-digital conversion is not only a problem for optical transmission, but also for wide-band communication systems, e.g., using direct sampling of the antenna, and for other types of data acquisition, e.g., oscilloscopes and spectrometers. One efficient way to increase the sample rate is to use time-interleaved data converters. However, as these are not identical due to process variations, DSP techniques are used to mitigate the effects [12], and the corresponding processing is obviously required to operate at high data rates.

The core in most spectrometers is a high data rate high resolution FFT [13, 14]. The number of points should, in general, be as large as possible to obtain high spectral resolution. This is limited by implementation technology and power budget. Therefore this project will lead to higher resolution spectrometers within the same or reduced power/area budget.

Other areas that use high-speed DSP based on the fundamental algorithms considered here include radar, spectrum sensing [15], and satellite communication.

# **3 Project Description – Impact of Parallelism**

The project studies the impact of parallelism in the implementation of high-speed digital front-ends. Note that within the project we treat the sample rate as a requirement, not something that can be adjusted, and, hence, the clock frequency is uniquely determined by the number of samples that are processed in parallel. As the number of operations per sample is typically high, as is the sample rate, the power consumption will always be significant for the considered applications. The challenge is therefore, simply put, to determine the degree of parallelism such that the energy per sample is minimized. However, as we will see, there are many aspects on several abstraction levels that must be considered.

In this context, several different aspects will be systematically considered. The objective is to provide better insights into what happens on the implementation level when algorithms are parallelized, what the best degree of parallelism from an implementation (power) point-of-view is, and new systematic techniques to analyze and design these parallel algorithms and architectures.

# **3.1 Pipelining and logic synthesis**

Pipelining is the process of introducing registers into the critical path to shorten it, enabling higher clock frequencies. A drawback is that the registers naturally consume power as well. Also, while pipelining enables very high clock frequencies, it is not always desirable to clock integrated circuits at multi-GHz frequencies, as this introduces severe constraints on placement, routing, and the clock distribution network. Furthermore, introducing many pipelining levels means that only very small logic functions can be performed between consecutive pipelining levels, increasing the complexity of reconfiguration and control. Finally, as discussed in Section 3.2, pipelining cannot be introduced in loops.

An alternative way to increase the clock frequency is to use operators with a shorter critical path. This typically corresponds to larger area, and, in general, one needs to find a suitable trade-off between the clock frequency (determined by the sample rate/parallelism ratio), the operator complexity and the amount of pipelining. This is one of the problems that will inherently be studied in this project.

**Research tasks** Newer CMOS technologies typically provide transistors (standard cells) with different threshold voltages, where typically one set provides a low threshold voltage and, therefore, high-speed operation, but at the same time higher static power consumption, and one set with a higher threshold voltage, leading to slower operation but lower static power consumption. In addition, there may be more intermediate threshold voltages. Clearly, increasing the parallelism will in many cases, but not all, allow a larger use of high threshold voltage cells, reducing the static power consumption despite introducing more transistors. Similarly, pipelining may have the same effect. However, it should be noted that if there are several parallel signal streams in an architecture and only one or a few of the streams are limiting the sample rate, it is more useful to use high-speed transistors for that part only, rather than introducing additional pipelining to all streams, or increasing the parallelism. Hence, the trade-off is more complex than just increasing the parallelism until high threshold transistors can be used everywhere. This is an aspect that will be considered and quantified in the project, along with the operator selection.



Figure 1: Integrator expressed for (a) one and (b) two samples per iteration. Transformed integrator (c) without and (d) with additional pipelining to make the critical path the same length as the iteration period bound.

# **3.2 Recursive algorithms**

All recursive algorithms have a limit on the available parallelism due to the recursive parts of the algorithm. Basically, the algorithm must finish one iteration before the next one can start and it is not possible to introduce pipelining in the loop to speed the computation up. The iteration period bound is expressed as

$$T_{\infty} = \max_{i} \frac{\sum_{\text{Op. in loop } i} T_{L,k}}{N_{i}} = \frac{1}{f_{\max}} \tag{1}$$

where $\sum_{\text{Op. in loop } i} T_{L,k}$ is the total latency of the operations in loop $i$ and $N_i$ is the number of delay elements in loop $i$ [5]. Although it is possible to unfold the algorithm to operate on more than one sample per iteration, the sample rate bound provided by the iteration period bound will not improve. This is illustrated through the following example of an integrator. In Fig. 1(a), a first-order integrator is shown. The iteration period bound for this algorithm is $T_{L,add}$, i.e., the latency of the adder. An algorithm processing two samples per iteration, obtained through unfolding, is shown in Fig. 1(b), where the iteration period bound now is $2T_{L,add}$. However, as the algorithm processes two samples per iteration, the effective bound on the sample period is the same, $T_{L,add}$. Hence, while it may be beneficial for other reasons to express the algorithm as processing more than one sample per iteration, no decrease in the sample period bound is obtained.

While the reasoning above is correct assuming discrete operators (adders) with a latency of  $T_{L,add}$ , there are still some effects that may make the final conclusion incorrect. First, the unfolded algorithm can be transformed to change the iteration period bound. An example of that is shown in Fig. 1(c), where the iteration period bound now is halved compared to Fig. 1(b) at the expense of an additional adder. Finally, in Fig. 1(d), pipelining is introduced to make the critical path as long as the iteration period bound, allowing the clock to run at the maximum allowable speed.
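The integrator example can be sketched behaviourally as follows. Fig. 1 is not reproduced here, so the transformed version is one possible transformation consistent with the description (the sum of the two inputs is moved out of the loop at the cost of a third adder), not necessarily the exact structure of Fig. 1(c). All function names are hypothetical, and the input length is assumed to be even.

```python
def iteration_period_bound(loops):
    """Eq. (1): max over loops of (sum of operation latencies) / (delay count).

    `loops` is a list of (latencies, number_of_delay_elements) pairs.
    """
    return max(sum(latencies) / n_delays for latencies, n_delays in loops)


def integrator_direct(x):
    """One sample per iteration: y(n) = y(n-1) + x(n)."""
    y, state = [], 0
    for sample in x:
        state = state + sample        # one adder in the loop
        y.append(state)
    return y


def integrator_unfolded(x):
    """Two samples per iteration, two chained adders inside the loop."""
    y, state = [], 0
    for k in range(0, len(x), 2):
        y_even = state + x[k]         # first adder in the loop
        y_odd = y_even + x[k + 1]     # second adder in the loop
        state = y_odd
        y += [y_even, y_odd]
    return y


def integrator_transformed(x):
    """Two samples per iteration, only one adder left inside the loop."""
    y, state = [], 0
    for k in range(0, len(x), 2):
        pair_sum = x[k] + x[k + 1]    # extra adder, outside the loop
        y_even = state + x[k]         # feed-forward output, outside the loop
        y_odd = state + pair_sum      # the only adder in the loop
        state = y_odd
        y += [y_even, y_odd]
    return y
```

With a unit adder latency, `iteration_period_bound([([1.0, 1.0], 1)])` gives 2 for the unfolded loop and `iteration_period_bound([([1.0], 1)])` gives 1 for the transformed loop, i.e., half an adder latency per sample in the latter case, while all three functions produce the same output sequence.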

While it can be argued that the integrator is a very simplistic example, the general observation holds for any algorithm: expressing the algorithm over more than one sample period will not change the iteration period bound assuming discrete operators, but will enable a higher degree of algorithmic transformations.

In addition, the sample rate will improve slightly even without transforming the algorithm. First, the contribution of the register setup and hold times will be averaged over more than one clock cycle. In addition, the latency of two cascaded adders (assuming carry-propagation adders such as ripple-carry adders or different carry look-ahead adders) is smaller than $2T_{L,add}$ when they are implemented jointly. Hence, even without the transformation, the resulting implementation in Fig. 1(b) will operate at a higher sample rate (but lower clock frequency) than the implementation in Fig. 1(a). It holds for most arithmetic operations that the delay from the inputs to the outputs depends on the particular bit weights. Finally, as there is now more logic to be implemented, the logic synthesis tool has a better chance of optimizing the design, somewhat similar to the transformations above but at the logic level.
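As a worked illustration of the cascaded-adder argument, consider two $W$-bit ripple-carry adders built from full adders, under the idealized assumption (not taken from the text) that both the sum and the carry output of a full adder are available one full-adder delay $t_{FA}$ after its last input arrives. Treated as discrete operators, each adder has latency $T_{L,add} = W t_{FA}$. When the two adders are implemented jointly, bit $i$ of the second adder can start as soon as bit $i$ of the first adder is ready, so the carry chains overlap:

$$2T_{L,add} = 2W\,t_{FA} \qquad \text{versus} \qquad T_{\text{joint}} = (W+1)\,t_{FA}$$

For $W = 16$, this corresponds to 17 instead of 32 full-adder delays.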

**Research tasks** There is currently a discrepancy in the literature between the theoretical limits (valid for discrete operators) and the practical synthesis results obtained after the algorithms are unfolded. The objective here is to investigate this issue in detail and derive expressions on how the sample rate increases with unfolding, despite the theory saying it should not. Also, high-level estimates of the possibility to apply transformations should be derived.



Figure 2: Modulator magnitude functions.

Table 1: Modulator noise transfer functions.

While recursive filters are not suitable for very high-speed implementations because of the iteration period bound, it is of interest to show how far the bounds can be moved. Also, data converters based on sigma-delta modulation are inherently recursive and, therefore, the implementation of unfolded sigma-delta modulators will also benefit from these results. In the latter case, the noise shaping part can be seen as using modulo arithmetic, which opens up the possibility to apply transformations to a much larger extent than previously considered. Since some of the modulators rely heavily on accumulators, the simplistic example above becomes highly relevant in this scenario. As preliminary work, we have derived expressions for systematic unfolding and transformation of integrators with application in sigma-delta modulators. This builds on our recent work in the area to improve the modulators [16]. Furthermore, we are currently developing optimization formulations that can find low-complexity realizations of sigma-delta modulators with an arbitrary number of delay elements in the critical loop, while taking implementation aspects into account, such as incorporating our closed-form approach to scaling<sup>2</sup> [17]. An example of complexity-optimized modulators with at least 45 dB SNR, 10 times oversampling, and scaled for a 3-bit DAC is shown with magnitude functions in Fig. 2 and corresponding noise transfer functions in Table 1.

# **3.3 Non-recursive algorithms**

For non-recursive algorithms, pipelining can be introduced to an arbitrary degree to increase the clock frequency. In addition, the algorithms can be unfolded to an arbitrary degree such that there are no algorithmic issues limiting the obtainable sampling rate.

The unfolding will lead to some other relevant effects though. These are different depending on the algorithm, but for the general cases of FIR filters and FFTs they will be illustrated below. For other classes of algorithms it is expected that similar effects can be obtained, although these are scarcely reported in the literature.

**Parallel FIR filters** To describe FIR filter algorithms processing several samples per iteration, it is useful to consider the polyphase representation. The *M*-fold polyphase representation of an *N*th-order FIR transfer function can be written as

$$H(z) = \sum_{n=0}^{N} h(n) z^{-n} = \sum_{m=0}^{M-1} H_m(z^M) z^{-m} \tag{2}$$

where

$$H_m(z^M) = \sum_{k=0}^{K-1} h(kM+m) z^{-kM} \tag{3}$$

is the *m*th polyphase branch, i.e., every *M*th coefficient of the FIR impulse response starting at index *m*. Using similar notation, FIR filters operating on two and three samples per iteration are shown in Figs. 3(a) and 3(b), respectively. As shown, considering the definition above, the two-sample case results in four FIR filters of half the length, and the three-sample case in nine filters of one third the length. However, it is possible to share some parts among the subfilters, e.g., only two (three) sets of delays are required for storing samples, not four (nine) as it may seem.
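As a worked instance of (2) and (3), take $M = 2$. Writing the input and output in the same polyphase form, with $X_m$ and $Y_m$ as symbols introduced here for illustration only and all polyphase components being functions of $z^2$, the product $Y(z) = H(z)X(z)$ separates into even and odd output phases:

$$Y_0 = H_0 X_0 + z^{-2} H_1 X_1, \qquad Y_1 = H_0 X_1 + H_1 X_0$$

Each of the four products is a subfilter of half the original length, and both outputs only need the two input sequences $X_0$ and $X_1$, so two sets of sample delays suffice.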

<sup>2</sup>This is often considered to be "stability" in many related works. However, the source of this "instability" is overflow, and, hence, by avoiding overflow in all nodes, the modulator will behave as expected and be "stable".



Figure 3: FIR filters processing (a) two and (b) three samples per iteration based on polyphase decomposition. Reduced complexity FIR filters processing (c) two and (d) three samples per iteration based on polyphase decomposition.

By utilizing results originating from Karatsuba multipliers [18], later generalized to different types of polynomial multiplications (convolutions), it is possible to reduce the number of subfilters, and therefore computations, to three and six as shown in Figs. 3(c) and 3(d), respectively [19]. Asymptotically, these filters reduce the number of multiplications per sample as $\left(\frac{3}{4}\right)^{\log_2 M}$, where *M* is the number of parallel streams. Still, there are some drawbacks with the approach. For example, the adders before and after the actual filters add to the complexity. In addition, the number of delay elements increases as $\left(\frac{4}{3}\right)^{\log_2 M}$.
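A minimal behavioural sketch of the two-sample case is given below. It uses the standard Karatsuba-style identity $H_0 X_1 + H_1 X_0 = (H_0 + H_1)(X_0 + X_1) - H_0 X_0 - H_1 X_1$, so that three half-length subfilters replace four; it is not claimed to match the exact structure of Fig. 3(c). The function names are hypothetical, and the input length is assumed to be even.

```python
def convolve_truncated(h, x):
    """First len(x) output samples of the causal convolution of h and x."""
    return [sum(h[j] * x[n - j] for j in range(min(len(h), n + 1)))
            for n in range(len(x))]


def fir_two_parallel_fast(h, x):
    """FIR filtering of x by h, two samples per iteration, three subfilters."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    h = list(h) + [0] * (len(h) % 2)      # pad to an even number of taps
    h0, h1 = h[0::2], h[1::2]             # polyphase branches of the filter
    x0, x1 = x[0::2], x[1::2]             # even and odd input samples

    # Pre-additions: one combined filter and one combined input stream.
    h_sum = [p + q for p, q in zip(h0, h1)]
    x_sum = [p + q for p, q in zip(x0, x1)]

    # Three half-length subfilters instead of four.
    a = convolve_truncated(h0, x0)
    b = convolve_truncated(h1, x1)
    c = convolve_truncated(h_sum, x_sum)

    # Post-additions; b enters the even output delayed by one iteration.
    y, b_prev = [], 0
    for a_k, b_k, c_k in zip(a, b, c):
        y.append(a_k + b_prev)            # even output sample
        y.append(c_k - a_k - b_k)         # odd output sample
        b_prev = b_k
    return y
```

For a filter with an even number of taps $L$, this uses $3 \cdot L/2$ multiplications per two output samples instead of $4 \cdot L/2$, at the cost of the pre- and post-additions and the extra delay. For integer inputs, the output matches `convolve_truncated(h, x)` exactly.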

In recent work, we derived an alternative FIR filter architecture that can roughly halve the number of multiplications and, hence, is suitable, e.g., for implementing the subfilters in Figs. 3(c) and 3(d) [20].

**Parallel FFT architectures** The classic result of the fast Fourier transform (FFT) is that the number of complex multiplications needed to compute a discrete Fourier transform (DFT) is reduced from $N^2$ to $\frac{N}{2} \log_2 N$ when *N* is a power of two [5]. However, for high-speed implementation, it is hard to find suitable architectures that fully utilize the complex multipliers, as the multiplications are ordered in time in a way that limits reuse. As an example, for most standard radix-2 one-sample-per-cycle architectures, such as the single-delay feedback (SDF) pipelined FFT, the utilization of each complex multiplier is 50%. Using half as many multipliers, which clearly is theoretically possible, would require a significant control and memory overhead, so the simpler architectures with more multipliers are preferred. Instead, algorithms in which some of the multipliers can be simplified, e.g., to only multiplying by 1 or -j, at the expense of higher utilization of the other multipliers, have been proposed [21].

Once these architectures are parallelized, it is not certain that every complex multiplier is replicated, since some of them now only process multiplications by 1. Consider, for example, the architecture in Fig. 4(a) based on the standard radix-2 FFT algorithm, processing eight samples per iteration. Each $\otimes$ corresponds to a complex multiplier, 16 in total, while each $\diamond$ corresponds to a trivial multiplier, only multiplying by 1 or $-j$, four in total. By using another algorithm, as shown in Fig. 4(b), an architecture with only 12 general complex multipliers, but 12 trivial multipliers, is obtained [22]. The latter architecture should in most cases be beneficial. Note that for the single-sample-per-iteration architecture, the radix-2 algorithm would result in four complex multipliers and one trivial, while the radix-$2^2$ algorithm requires two general and three trivial. While the rank is still the same, the ratio has changed. This shows that one needs to consider the complexity at the appropriate degree of parallelism. There exists a large class of different FFT algorithms to be considered [23], as other complex multipliers can be simplified as well when the number of coefficients is small [24].
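The single-path counts quoted above can be reproduced with a short sketch. It assumes a 64-point transform (a size inferred from the quoted counts, not stated in the text), the textbook decimation-in-frequency placement of rotators between butterfly stages, and that a rotator is trivial when all its twiddle factors are powers of $W_4$, i.e., in $\{1, -j, -1, j\}$. The function names are hypothetical.

```python
def radix2_rotator_sizes(n_points):
    """Twiddle set size L (rotations by powers of W_L) after each stage
    of a single-path radix-2 DIF FFT. Assumes n_points is a power of two."""
    stages = n_points.bit_length() - 1
    return [n_points >> (s - 1) for s in range(1, stages)]


def radix22_rotator_sizes(n_points):
    """Same for radix-2^2: every odd stage is followed by a trivial
    rotator (W_4), every even stage by a general one.
    Assumes n_points is a power of four."""
    stages = n_points.bit_length() - 1
    sizes = []
    for s in range(1, stages):
        if s % 2 == 1:
            sizes.append(4)
        else:
            sizes.append(n_points >> (s - 2))
    return sizes


def classify(sizes):
    """Return (general, trivial) rotator counts."""
    general = sum(1 for size in sizes if size > 4)
    return general, len(sizes) - general


print(classify(radix2_rotator_sizes(64)))   # radix-2
print(classify(radix22_rotator_sizes(64)))  # radix-2^2
```

Under these assumptions the sketch yields four general and one trivial rotator for radix-2, and two general and three trivial for radix-$2^2$. It does not model the eight-parallel architectures of Fig. 4, where the counts depend on how the rotators are distributed over the parallel paths.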

Figure 4: FFT architectures processing eight samples per iteration using (a) a standard radix-2 algorithm and (b) a radix-$2^2$ algorithm resulting in fewer complex multipliers.

**Research tasks** The research tasks here will involve systematic investigations of the trade-off between parallelism and power for non-recursive algorithms. Primarily, FIR filters and FFTs will be considered, as these constitute a major part of the processing in most digital front-ends. However, other interesting algorithms will also be considered, e.g., adaptive FIR filters, where the adaptation loop will introduce additional constraints. In this context it should be noted that it is possible to perform FIR filtering (convolutions) using FFTs, but that most complexity comparisons found in the literature are based on the number of real multiplications using the classic FIR and FFT complexity equations. These are not correct for high-speed operation since, as discussed above, the number of multipliers is a more relevant estimate of complexity. Some multipliers in the FFT can be simplified<sup>3</sup>, and the number of multiplications per sample decreases with parallelism if reduced complexity FIR filter structures are used. In addition, in the two commonly used schemes for FIR filtering through FFTs, either some of the inputs are zero or some of the outputs are ignored. Hence, the computational complexity can be reduced considering this.
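To make the remark about ignored outputs concrete, the following sketch implements FIR filtering through FFTs with the overlap-save scheme, where the first outputs of every block are discarded. The small recursive FFT, the function names, and the block length are assumptions chosen for a self-contained illustration, not the architecture studied in the project.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled). len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(a):
    """Inverse FFT including the 1/n scaling."""
    return [v / len(a) for v in fft(a, inverse=True)]


def direct_fir(h, x):
    """Reference time-domain FIR filter, same length as the input."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]


def overlap_save(h, x, n_fft):
    """FFT-based FIR filtering; n_fft must be a power of two > len(h) - 1."""
    m = len(h)
    step = n_fft - (m - 1)                    # valid outputs per block
    h_freq = fft(list(h) + [0] * (n_fft - m)) # zero-padded filter
    padded = [0] * (m - 1) + list(x)
    y = []
    pos = 0
    while pos < len(x):
        block = padded[pos:pos + n_fft]
        block += [0] * (n_fft - len(block))
        prod = [p * q for p, q in zip(fft(block), h_freq)]
        y_block = ifft(prod)
        # The first m - 1 outputs are corrupted by circular wrap-around
        # and are ignored.
        y.extend(v.real for v in y_block[m - 1:])
        pos += step
    return y[:len(x)]


h = [1, 2, 3]
x = list(range(1, 11))
print(max(abs(a - b) for a, b in zip(overlap_save(h, x, 8), direct_fir(h, x))))
```

In the other common scheme, overlap-add, each input block is instead zero-padded before the FFT, so part of the transform input is known to be zero. In both cases, part of the butterfly network computes values that are either known in advance or never used.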

A solid comparison of FFT-based FIR filtering with the reduced complexity FIR filter techniques will be one of the first tasks in the project. This will be a very useful result when implementing high-speed convolutions. To obtain a fair comparison, similar finite word length properties should be achieved for both approaches. Despite FFT-based FIR filtering having been around for half a century, almost no work has been done on its finite word length properties. In a preliminary work, we have, e.g., shown that with finite word lengths, the FFT-based FIR filter will have a time-varying filter response [25]. Additional work will include FIR filter design taking this into account. In addition, our previous work on parallel FFT architectures [22] will be extended. In preliminary results, we have implemented 4096-point FFTs reaching 10.9 GSa/s and 65536-point FFTs reaching 7.4 GSa/s in single FPGAs [26].

The complexity reduced FIR filters, as outlined in Figs. 3(c) and 3(d), will be reconsidered. Here, there are two main issues that will be considered. The first is that there is a lower bound on the number of subfilters, which is $2M - 1$. However, for most non-power-of-two factors, reaching this bound involves division by constant integers, something that is in general not considered suitable in a hardware implementation. The focus of this part is to apply different techniques to get around this and also to evaluate the actual cost of this compared to the savings. As part of this, we are currently working on systematic methods to derive highly parallel structures from lower-order ones in a different way than the commonly used factorization. The results are promising, especially for prime factors. The second is to systematically derive filter structures based on integer linear programming design. This is primarily driven by the fact that for linear-phase FIR filters the coefficient symmetry is in general not retained in the subfilters using the standard structures. This problem has partly been considered in [27], however, only in an ad-hoc way and for a few values of $M$.
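The division by constant integers can be seen in a small worked example for $M = 3$, where the bound gives $2M - 1 = 5$ products. The sketch below multiplies two three-coefficient polynomials with five products using evaluation and interpolation (the Toom-Cook approach); in the filter setting the coefficients would be polyphase components and each product a subfilter. The evaluation points $0, 1, -1, 2, \infty$ and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction


def evaluate(p, t):
    """Evaluate polynomial p (lowest-order coefficient first) at t."""
    return sum(c * t ** k for k, c in enumerate(p))


def toom3(a, b):
    """Multiply two 3-coefficient integer polynomials with 5 products."""
    # Five products: four finite evaluation points plus infinity
    r0 = evaluate(a, 0) * evaluate(b, 0)
    r1 = evaluate(a, 1) * evaluate(b, 1)
    rm1 = evaluate(a, -1) * evaluate(b, -1)
    r2 = evaluate(a, 2) * evaluate(b, 2)
    r_inf = a[2] * b[2]

    # Interpolation: note the divisions by 2 and by 3
    c0 = r0
    c4 = r_inf
    c2 = Fraction(r1 + rm1, 2) - c0 - c4
    s = Fraction(r1 - rm1, 2)                  # c1 + c3
    t = (r2 - c0 - 4 * c2 - 16 * c4) / 2       # c1 + 4*c3
    c3 = (t - s) / 3
    c1 = s - c3
    return [c0, c1, c2, c3, c4]


print(toom3([1, 2, 3], [4, 5, 6]))
```

A direct product of the same polynomials needs nine coefficient multiplications. The five-product version trades these for additions and the constant divisions in the interpolation step, which is the cost that has to be weighed against the savings.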

# 3.4 Parallel multiplication

For an FIR filter, either the input is multiplied by all the coefficients, as in the transposed direct form FIR filter, or the delayed inputs are multiplied by the coefficients and summed up, as in the direct form FIR filter. In both cases it is possible to share redundant parts<sup>4</sup>, which is most often utilized when the coefficients are constant and the multipliers are replaced by shifts, adders, and subtractors, a so-called *multiplierless* implementation [28, 29]. For polyphase FIR filters, the redundancy utilization can be increased by considering all filters simultaneously, resulting in a constant matrix-vector multiplication [30, 31].
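A minimal sketch of the sharing idea for the transposed direct form, where one input is multiplied by several constants: the coefficient set $\{7, 21, 35\}$ is a hypothetical example chosen because all three constants share the factor 7, and the adder counts in the comments refer to this example only.

```python
def multiply_separate(x):
    """Each constant realized on its own with shifts and adders."""
    x7 = (x << 3) - x                   # 1 adder:  8x - x
    x21 = (x << 4) + (x << 2) + x       # 2 adders: 16x + 4x + x
    x35 = (x << 5) + (x << 2) - x       # 2 adders: 32x + 4x - x
    return {7: x7, 21: x21, 35: x35}    # 5 adders in total


def multiply_shared(x):
    """The intermediate result 7x is computed once and reused."""
    x7 = (x << 3) - x                   # adder 1: 8x - x
    x21 = (x7 << 1) + x7                # adder 2: 2*(7x) + 7x
    x35 = (x7 << 2) + x7                # adder 3: 4*(7x) + 7x
    return {7: x7, 21: x21, 35: x35}    # 3 adders in total


print(multiply_shared(5))
print(multiply_shared(5) == multiply_separate(5))
```

Shifts are free in a fixed hardware implementation, so the adder count is the relevant cost. Finding a sharing pattern with few adders for a given coefficient set is the multiple constant multiplication problem.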

<sup>3</sup>This is also the case for FIR filters if the coefficients are known beforehand, as discussed in Section 3.4.

<sup>4</sup>While this is easiest to realize for the transposed direct form structure, one can note that a direct form structure with identical complexity is obtained through transposition.

| Decimation factors | Multiplication rate (multiplier-based) | Addition rate (multiplierless) |
|--------------------|----------------------------------------|--------------------------------|
| [8]                | 10.56                                  | 29.88                          |
| [2 4]              | 6.69                                   | 25                             |
| [4 2]              | 5.13                                   | 18.63                          |
| [2 2 2]            | 3.25                                   | 22                             |

Table 2: Multiplication and addition rates per output sample for a multiplier-based and a multiplierless implementation, respectively, for a decimation by 8 FIR filter using different partitioning.

When the coefficients are not given, for example in adaptive filters, or in chromatic dispersion filters where different fiber lengths result in different filter coefficients, the above-mentioned techniques cannot be applied. However, it may be beneficial to use high-radix multipliers, where small multiples ($0$ to $N - 1$ for a radix-$N$ multiplier) are computed once and then shared among the multipliers. The individual part of each multiplier will then decrease in size, as the number of partial terms to sum up is reduced by a factor of $\log_2 N$. This has been proposed for single FIR filters in [32]. However, the best choice of radix will depend on the filter length, something that is not considered in the previous work.
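The shared-multiples idea for high-radix multipliers can be sketched as follows. The multiples $0, x, \dots, (N-1)x$ are computed once per input sample, and each coefficient then only selects and sums one multiple per radix-$N$ digit. The function names, the unsigned 8-bit coefficient format, and the example values are assumptions made for this illustration and are not taken from [32].

```python
def shared_multiples(x, radix):
    """Multiples 0..radix-1 of x, computed once and shared by all multipliers."""
    return [k * x for k in range(radix)]


def multiply_all(x, coeffs, radix=4, word_length=8):
    """Multiply x by every coefficient using the shared multiples.

    Assumes radix is a power of two and each coefficient is an unsigned
    integer smaller than 2**word_length.
    """
    bits = radix.bit_length() - 1           # log2(radix)
    n_digits = -(-word_length // bits)      # ceil(word_length / bits)
    table = shared_multiples(x, radix)
    products = []
    for c in coeffs:
        acc = 0
        for k in range(n_digits):
            digit = (c >> (k * bits)) & (radix - 1)
            acc += table[digit] << (k * bits)   # one partial term per digit
        products.append(acc)
    return products, n_digits


print(multiply_all(13, [5, 100, 255], radix=4))
print(multiply_all(13, [5, 100, 255], radix=8))
```

With 8-bit coefficients, radix 4 leaves four partial terms per coefficient and radix 8 leaves three, while the shared table grows from four to eight entries. The table cost is amortized over all coefficients, which is why the best radix depends on the filter length.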

Another aspect which has not yet been considered is the operator trade-off for multiplierless implementation of multi-stage DSP, e.g., multi-stage interpolation and decimation filters for oversampled data converters. It is well established that the multiplication rate is reduced by performing this operation in multiple stages [33]. However, more stages means less redundancy within each stage. In a preliminary work, we have discovered that the optimal multiplication rate does not correspond to the optimal addition rate in the multiplierless implementation. This is illustrated through the example in Table 2, where it is clear that different decimation stage partitionings give the optimal operation count for the two implementations.

A similar effect happens when, e.g., the structures in Figs. 3(a) and 3(c) are compared. For Fig. 3(a), all multipliers can be jointly simplified as discussed above, while for Fig. 3(c), the three subfilters must be considered separately. Hence, despite fewer multipliers in Fig. 3(c), the total number of adders in a multiplierless implementation may still be smaller in Fig. 3(a), as shown in our preliminary work [31].

**Research tasks** The trade-off between radix and filter length should be properly analyzed. In addition, it is possible to use Booth encoding, reducing the number of partial results to be computed at the expense of more complicated encoding logic. In the larger context, the applicability of this should be considered for the parallel FIR filters above. For the traditional polyphase case, $M$ sets of multiplications must be computed, one for each input, while for the reduced complexity case, $\left(\frac{4}{3}\right)^{\log_2 M}$ sets must be used. Hence, this provides an additional dimension in the parallelism-power analysis. The work on multi-stage interpolation and decimation filters will be extended to related situations.
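As a sketch of the Booth encoding mentioned above, the following code recodes a two's-complement coefficient into radix-4 digits from $\{-2, -1, 0, 1, 2\}$, so that only the multiples $x$ and $2x$ (a shift) and their negations are needed, instead of the $3x$ required by plain radix-4 digits. The function names and the 8-bit word length are assumptions for this illustration.

```python
def booth_radix4_digits(c, word_length=8):
    """Radix-4 Booth digits of c, least significant digit first.

    Assumes an even word_length and a two's-complement coefficient
    that fits in word_length bits.
    """
    if word_length % 2:
        raise ValueError("word_length must be even")
    if not -(1 << (word_length - 1)) <= c < (1 << (word_length - 1)):
        raise ValueError("coefficient does not fit in word_length bits")
    u = c & ((1 << word_length) - 1)     # two's-complement bit pattern

    def bit(i):
        return (u >> i) & 1 if i >= 0 else 0

    # Each digit is formed from three overlapping bits: the encoding logic
    return [bit(2 * k - 1) + bit(2 * k) - 2 * bit(2 * k + 1)
            for k in range(word_length // 2)]


def booth_multiply(x, c, word_length=8):
    """Multiply x by c using one partial term per Booth digit."""
    acc = 0
    for k, d in enumerate(booth_radix4_digits(c, word_length)):
        acc += (d * x) << (2 * k)        # d*x is 0, +-x or +-2x
    return acc


print(booth_radix4_digits(-77))
print(booth_multiply(13, -77))
```

For an 8-bit coefficient this gives four partial terms, the same count as plain radix-4, but without a precomputed $3x$. The price is the digit-encoding logic in `bit` and the need to handle negated terms.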

# 3.5 Research summary

To end up with efficient implementations it is important to simultaneously consider the impact of parallelism on algorithms, architectures, and arithmetic. While many of the results initially will be obtained based on high-level complexity estimations, it is important to obtain accurate power figures. This will be done using simulation tools to a large extent, but for relevant cases, we will also design and manufacture integrated circuits to further substantiate our findings. The comparison in Table 3 shows a summary of which aspects will be taken into account and how they relate to the currently most complete design methodology [3].

# 3.6 Environment

**Research environment** The Division of Computer Engineering is headed by the PI, Dr. Oscar Gustafsson. The PI, together with the co-PI Prof. Håkan Johansson from the Division of Communication Systems, has a long history in the area of efficient algorithms and implementations for DSP algorithms, primarily digital filters and filter banks. Also, arithmetic circuits have been a major research topic in recent years, primarily led by the PI. Dr. Mario Garrido adds significant expertise in the design and implementation of FFT architectures and related building blocks. Currently, the PI supervises three PhD students and the co-PI one PhD student in related areas.

| Transformation                                                     | Architecture impact                                                                                 | Considered in [3]? | Considered in project? |
|--------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------|--------------------|------------------------|
| Unfolding a non-recursive algorithm a factor $M$                   | $M$ samples per iteration, $M$ times the operators                                                  | Yes                | Yes                    |
| Unfolding a recursive algorithm a factor $M$                       | $M$ samples per iteration, $M$ times the operators, no change in sample period bound                | Yes                | Yes                    |
|                                                                    | Non-additive latency of cascaded operators, move operations out of the loops                        | No                 | Yes                    |
| Unfolding a time-multiplexed architecture a factor $M$             | $M$ times higher throughput, $M$ times the operators                                                | Yes                | Yes                    |
|                                                                    | Factor $M$ fewer configurations for each operator, operator complexity based on configuration modes | No                 | Yes                    |
| Deriving $M$ parallel algorithms based on computational redundancy | $M$ samples per iteration, $< M$ times the operators                                                | No                 | Yes                    |
| Architecture dependent optimized constant multiplications          | Varying multiplier complexity                                                                       | No                 | Yes                    |

Table 3: Impact of parallelism and algorithm transformations at the architecture level.

**Relation to other funding/applications** Earlier VR grants will to some extent provide results to the current project. In addition, the PI had a project funded by the Linköping University research organization CENIIT<sup>5</sup> on efficient implementation taking the special properties of FPGAs into account. This will strengthen the proposed project, while dealing with a different topic. The PI was awarded a "career contract" by Linköping University, which ended in 2013. In 2013, the PI received directed funding from the European Space Agency (ESA) to summarize the relevant work related to on-board processing for satellite communication.

**National and international collaboration** The PI and co-PI have a broad network of international contacts from, e.g., UCLA, Nanyang Technological University, Technical University of Denmark, University of Westminster, Tampere University of Technology, and IRISA. Nationally, collaboration with research groups at Chalmers is ongoing in the area of optical communication. There is also collaboration with Lund University through the ELLIIT strategic research area.

# 4 Significance

This project will substantially increase the knowledge of how to design and implement the digital front-end signal processing with very low power consumption for current and emerging high-speed communication standards and related applications, targeting data rates of up to tens or hundreds of GSa/s. In addition, it will provide a deeper understanding of how the computational redundancy is affected by parallelism. These results will be enablers for future wireless terminals and other high-speed communication equipment with low energy consumption, which in turn is advantageous both for the user (long battery lifetime and/or lower cooling requirements) and for the environment (less energy consumed). The results will also serve as guidelines when designing high-speed signal processing systems, and even standards, providing knowledge about which design parameters are feasible and appropriate.

In addition, it is expected that new architectures and algorithms for the considered problems will be proposed, with a focus on utilizing the additional redundancy obtained from parallelism as well as formulating optimization problems that directly consider the high-speed properties.

# 5 Preliminary results

Preliminary results are discussed throughout the proposal, e.g., Fig. 2 and Tables 1 and 2 illustrate results from ongoing work.

<sup>&</sup>lt;sup>5</sup>http://www.isy.liu.se/ceniit/

# References

- [1] A. P. Chandrakasan and R. W. Brodersen, Low power digital CMOS design. Kluwer Academic Pub, 1995.
- [2] R. Gonzalez, B. M. Gordon, and M. A. Horowitz, "Supply and threshold voltage scaling for low power CMOS," *IEEE J. Solid-State Circuits*, vol. 32, no. 8, pp. 1210–1216, 1997.
- [3] D. Markovic, "A power/area optimal approach to VLSI signal processing," Ph.D. dissertation, Univ. California, Berkeley, 2006.
- [4] "International technology roadmap for semiconductors," Tech. Rep., 2013. [Online]. Available: http://www.itrs.net/
- [5] L. Wanhammar, DSP Integrated Circuits. Academic press, 1999.
- [6] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. Wiley-Interscience, 1999.
- [7] (2015, Mar.) Wireless gigabit alliance. [Online]. Available: http://www.wigig.org/
- [8] (2015, Mar.) WirelessHD. [Online]. Available: http://www.wirelesshd.org/
- [9] T. Ahmed, M. Garrido, and O. Gustafsson, "A 512-point 8-parallel pipelined feedforward FFT for WPAN," in *Proc. Asilomar Conf. Signals Syst. Comput.*, 2011, pp. 981–984.
- [10] S. J. Savory, "Digital filters for coherent optical receivers," Opt. Express, vol. 16, no. 2, pp. 804–817, 2008.
- [11] A. Eghbali, H. Johansson, O. Gustafsson, and S. Savory, "Optimal least-squares FIR digital filters for compensation of chromatic dispersion in digital coherent optical receivers," J. Lightw. Technol., vol. 32, no. 8, pp. 1449–1456, 2014.
- [12] H. Johansson, "A polynomial-based time-varying filter structure for the compensation of frequency-response mismatch errors in time-interleaved ADCs," *IEEE J. Sel. Topics Signal Process.*, vol. 3, no. 3, pp. 384–396, 2009.
- [13] A. Parsons et al., "Petaop/second FPGA signal processing for SETI and radio astronomy," in Proc. Asilomar Conf. Signals Syst. Comput., 2006, pp. 2031–2035.
- [14] N. Sane, J. Ford, A. I. Harris, and S. S. Bhattacharyya, "Prototyping scalable digital signal processing systems for radio astronomy using dataflow models," *Radio Sci.*, vol. 47, no. 3, 2012.
- [15] T.-H. Yu, C.-H. Yang, D. Cabric, and D. Markovic, "A 7.4-mW 200-MS/s wideband spectrum sensing digital baseband processor for cognitive radios," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2235–2245, 2012.
- [16] N. Afzal, O. Gustafsson, and J. J. Wikner, "Reducing complexity and power of digital multi-bit error-feedback  $\Delta\Sigma$ -modulators," *IEEE Trans. Circuits Syst. II*, vol. 61, no. 9, pp. 641–645, Sept 2014.
- [17] N. Afzal, O. Gustafsson, and J. J. Wikner, "On scaling and output cardinality of digital multi-bit error-feedback modulators," *IEEE Trans. Circuits Syst. II*, 2015, provisionally accepted.
- [18] A. Karatsuba and Y. Ofman, "Multiplication of multidigit numbers on automata," in Soviet physics doklady, vol. 7, 1963.
- [19] D. A. Parker and K. K. Parhi, "Low-area/power parallel FIR digital filter implementations," *J. VLSI Signal Process. Syst.*, vol. 17, no. 1, pp. 75–92, 1997.
- [20] O. Gustafsson and A. Ehliar, "Low-complexity general FIR filters based on Winograd's inner product algorithm," in *Proc. IEEE Int. Symp. Circuits Syst.*, 2013.
- [21] S. He and M. Torkelson, "A new approach to pipeline FFT processor," in Proc. Int. Parallel Process. Symp., 1996.
- [22] M. Garrido, J. Grajal, M. A. Sanchez, and O. Gustafsson, "Pipelined radix-2<sup>k</sup> feedforward FFT architectures," *IEEE Trans. VLSI Syst.*, vol. 21, no. 1, pp. 23–32, Jan. 2013.
- [23] F. Qureshi, "Optimization of rotations in FFTs," Ph.D. dissertation, Linköping, 2012.
- [24] M. Garrido, F. Qureshi, and O. Gustafsson, "Low-complexity multiplierless constant rotators based on combined coefficient selection and shift-and-add implementation (CCSSI)," *IEEE Trans. Circuits Syst. I*, vol. 61, no. 7, pp. 2002–2012, 2014.
- [25] H. Johansson and O. Gustafsson, "On frequency-domain implementation of digital FIR filters," in *Proc. DSP Workshop*, 2015.
- [26] M. Garrido, M. Acevedo, A. Ehliar, and O. Gustafsson, "Challenging the limits of FFT performance on FPGAs," in *Proc. Int. Symp. Integrated Circuits*, 2014, pp. 172–175.
- [27] Y.-C. Tsao and K. Choi, "Area-efficient parallel FIR digital filter structures for symmetric convolutions based on fast FIR algorithm," *IEEE Trans. VLSI Syst.*, vol. 20, no. 2, pp. 366–371, 2012.
- [28] O. Gustafsson, "Lower bounds for constant multiplication problems," *IEEE Trans. Circuits Syst. II*, vol. 54, no. 11, pp. 974–978, Nov. 2007.
- [29] K. Johansson, "Low power and low complexity shift-and-add based computations," Ph.D. dissertation, Linköping University, 2008.
- [30] O. Gustafsson and A. G. Dempster, "On the use of multiple constant multiplication in polyphase FIR filters and filter banks," in *Proc. IEEE Nordic Signal Process. Symp.*, 2004, pp. 53–56.
- [31] A. Eghbali, O. Gustafsson, H. Johansson, and P. Löwenborg, "On the complexity of multiplierless direct and polyphase FIR filter structures," in *Proc. Int. Symp. Image Signal Process. Analysis*, 2007, pp. 200–205.
- [32] R. Mahesh and A. Vinod, "New reconfigurable architectures for implementing FIR filters with low complexity," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 29, no. 2, pp. 275–288, 2010.
- [33] R. Crochiere and L. Rabiner, "Interpolation and decimation of digital signals—a tutorial review," *Proc. IEEE*, vol. 69, no. 3, pp. 300–331, Mar. 1981.

# My application is interdisciplinary

 $\Box$ 

An interdisciplinary research project is defined in this call for proposals as a project that can not be completed without knowledge, methods, terminology, data and researchers from more than one of the Swedish Research Councils subject areas; Medicine and health, Natural and engineering sciences, Humanities and social sciences and Educational sciences. If your research project is interdisciplinary according to this definition, you indicate and explain this here.


Scientific report

Scientific report/Account for scientific activities of previous project

## **Budget and research resources**

# **Project staff**

Describe the staff that will be working in the project and the salary that is applied for in the project budget. Enter the full amount, not in thousands SEK.

Participating researchers that accept an invitation to participate in the application will be displayed automatically under Dedicated time for this project. Note that it will take a few minutes before the information is updated, and that it might be necessary for the project leader to close and reopen the form.

# Dedicated time for this project

| Role in the project                       | Name             | Percent of full time |
|-------------------------------------------|------------------|----------------------|
| 1 Applicant                               | Oscar Gustafsson | 20                   |
| 2 Other personnel without doctoral degree | New PhD student  | 80                   |
| 3 Participating researcher                | Håkan Johansson  | 10                   |

# Salaries including social fees

| Role in the project                       | Name             | Percent of salary | 2016 | 2017 | 2018 | 2019 | Total |
|-------------------------------------------|------------------|-------------------|------|------|------|------|-------|
| 1 Applicant                               | Oscar Gustafsson | 20                | 220  | 227  | 234  | 241  | 922   |
| 2 Participating researcher                | Håkan Johansson  | 10                | 129  | 132  | 136  | 140  | 537   |
| 3 Other personnel without doctoral degree | New PhD student  | 80                | 426  | 439  | 452  | 465  | 1,782 |
| Total                                     |                  |                   | 775  | 798  | 822  | 846  | 3,241 |

# Other costs

Describe the other project costs for which you apply from the Swedish Research Council. Enter the full amount, not in thousands SEK.

**Premises**

| Type of premises | 2016 | 2017 | 2018 | 2019 |
|------------------|------|------|------|------|

**Running Costs**

| Running Cost                       | Description | 2016 | 2017 | 2018 | 2019 | Total |
|------------------------------------|-------------|------|------|------|------|-------|
| 1 Conference and publication costs |             | 40   | 60   | 60   | 60   | 220   |
| 2 Circuit fabrication              |             | 0    | 80   | 80   | 80   | 240   |
| 3 Computer and software            |             | 40   | 0    | 0    | 0    | 40    |
| Total                              |             | 80   | 140  | 140  | 140  | 500   |

**Depreciation costs**

| Depreciation cost | Description | 2016 | 2017 | 2018 | 2019 |
|-------------------|-------------|------|------|------|------|

## **Total project cost**

Below you can see a summary of the costs in your budget, which are the costs that you apply for from the Swedish Research Council. Indirect costs are entered separately into the table.

Under Other costs you can enter which costs, aside from the ones you apply for from the Swedish Research Council, that the project includes. Add the full amounts, not in thousands of SEK.

The subtotal plus indirect costs are the total per year that you apply for.

| Specified costs                | 2016  | 2017  | 2018  | 2019  | Total, applied | Other costs | Total cost |
|--------------------------------|-------|-------|-------|-------|----------------|-------------|------------|
| Salaries including social fees | 775   | 798   | 822   | 846   | 3,241          |             | 3,241      |
| Running costs                  | 80    | 140   | 140   | 140   | 500            |             | 500        |
| Depreciation costs             |       |       |       |       | 0              |             | 0          |
| Premises                       |       |       |       |       | 0              |             | 0          |
| Subtotal                       | 855   | 938   | 962   | 986   | 3,741          | 0           | 3,741      |
| Indirect costs                 | 299   | 328   | 337   | 342   | 1,306          |             | 1,306      |
| Total project cost             | 1,154 | 1,266 | 1,299 | 1,328 | 5,047          | 0           | 5,047      |


Briefly justify each proposed cost in the stated budget.

## Explanation of the proposed budget\*

The salary costs cover the project leader, the co-project leader, and one PhD student (to be recruited). Publication costs cover on average two conference contributions and one journal article per year. Three test circuits is a realistic target for verifying the work. At start-up and recruitment of a new PhD student, there are additional costs for setting up the workplace with a computer and similar equipment. Premises costs are handled by including them in the indirect costs that are added.

# Other funding

Describe your other project funding for the project period (applied for or granted) aside from that which you apply for from the Swedish Research Council. Write the whole sum, not thousands of SEK.

## Other funding for this project

| Funder | Applicant/project leader | Type of grant | Reg no or equiv. | 2016 | 2017 | 2018 | 2019 |
|--------|--------------------------|---------------|------------------|------|------|------|------|

# CV and publications

# cv

# **CURRICULUM VITAE**

### OSCAR GUSTAFSSON

| Address:      | Linköping University<br>Dept. of Electrical Engineering (ISY)<br>SE–581 83 Linköping<br>Sweden |
|---------------|--------------------------------------------------------------------------------------------|
| Phone:        | +46-13-284059                                                                              |
| E-mail:       | oscar.gustafsson@liu.se                                                                    |
| WWW:          | www.da.isy.liu.se                                                                          |
| Publications: | http://scholar.google.com/citations?user=YIFravQAAAAJ                                      |

## **REQUIRED INFORMATION**

- 1. M.Sc. (Civilingenjör) in Applied Physics and Electrical Engineering, Linköping University, Sweden. April 1998.
- 2. Ph.D. in Electronics Systems, Linköping University, Sweden. Sept. 2003
- 3. –
- 4. Docent in Electronics Systems, Linköping University, Sweden. April 2008
- 5. Associate Professor/Senior Lecturer (universitetslektor), July 2009–, 50% research, 10% administration, 40% teaching
- 6. Assistant professor (forskarassistent), Oct. 2003–June 2009
- 7. Paternal leave: Jan. 2007–July 2007 (part time, about three months in total), Feb. 2003–Oct. 2003 (part time, about five months in total)
- Examined PhD students: Kenny Johansson, Oct. 2008, Anton Blad, Sept. 2011, Fahad Qureshi, Mar. 2012. Hosted post-doc: Mario Garrido, June 2010–June 2012, Kevin Cushon, March 2014–Aug. 2014

## **EDUCATION**

| April 2008              | Docent (equiv. to Associate Professor) in Electronics Systems       | Linköping University, Sweden                      |
|-------------------------|---------------------------------------------------------------------|---------------------------------------------------|
| April 1998 – Sept. 2003 | Ph.D. in Electronics Systems                                        | Linköping University, Sweden                      |
| Sept. 1995 – April 1996 | Exchange student (M.Sc. in Digital Systems Eng.)                    | Heriot-Watt University, Edinburgh, United Kingdom |
| Sept. 1992 – April 1998 | M.Sc. (Civilingenjör) in Applied Physics and Electrical Engineering | Linköping University, Sweden                      |

## **APPOINTMENTS**

| July 2014 –             | Head (Avdelningschef) of the Division of Computer Engineering, Linköping University       |
|-------------------------|-------------------------------------------------------------------------------------------|
| Jan. 2012 - Dec. 2014   | Vice head (Pro-prefekt) of the Department of Electrical Engineering, Linköping University |
| Sept. 2009 - June 2014  | Head (Avdelningschef) of the Division of Electronics Systems, Linköping University        |
| July 2009 –             | Associate Professor/Senior Lecturer (Universitetslektor), Linköping University            |
| July 2009 – Aug. 2009   | Guest researcher, Nanyang Technological University, Singapore                             |
| Oct. 2003 – June 2009   | Assistant Professor (Forskarassistent), Linköping University                              |
| Jan. 2007 – July 2007   | Paternal leave (part time, about three months in total)                                   |
| Feb. 2003 – Oct. 2003   | Paternal leave (part time, about five months in total)                                    |
| April 1998 - Sept. 2003 | Ph.D. Student, Linköping University                                                       |

## **RESEARCH INTERESTS**

- Design and implementation of DSP and communication algorithms
- Digital arithmetic
- Optimization for electrical engineering

# **RESEARCH FUNDING**

- Energy efficient arithmetic, Research associate (forskarassistent) grant, Vetenskapsrådet, 2004–2007, 810 kSEK/y.
- Algorithm-hardware co-design for memory and communication dominated applications, project grant, Vetenskapsrådet, 2007–2009, 675 kSEK/y
- Practical algorithms for highly efficient cooperative transmission in wireless networks, Vinnova, 2008/07–2011/06, 1500 kSEK/y (as Co-PI)
- Contract research, A2B Electronics AB, 2011, 200 kSEK
- Statistical signal processing for communication receivers, project grant, Vetenskapsrådet, 2009–2011, avg. 1500 kSEK/y (as Co-PI)
- Techniques for nonlinear digital-to-analog conversion, project grant, Vetenskapsrådet, 2009–2011, avg. 875 kSEK/y
- Funding for hosting a post-doctoral researcher, ELLIIT (LiU), 2010+2011, 400 + 400 kSEK
- Algorithm-hardware co-design for FPGAs, project grant, CENIIT (LiU), 2007–2012, 475 kSEK/y
- Techniques for nonlinear digital-to-analog converters, project grant, Vetenskapsrådet, 2010–2012, avg. 788 kSEK/y
- Career contract, LiU, 2009–2013, 525 kSEK/y (1100 kSEK for 2013)
- Techniques for on-board digital signal processing, European Space Agency, 25 kEUR, 2013

# **SCIENTIFIC OUTPUT**

- 4 book chapters
- 26 journal papers
- 124 papers in international conferences
- 1 patent application
- 1 publicly released software package

## **VOLUNTEER ACTIVITIES**

- Senior Member of IEEE
- Member of IEICE
- Associate editor for IEEE Transactions on Circuits and Systems Part II: Express Briefs, 2010–2013, and for Integration, the VLSI Journal, 05/2011–
- Guest editor of Journal of Electrical and Computer Engineering special issue on "Hardware Implementation of Digital Signal Processing Algorithms"
- Member of IEEE Circuits and Systems VLSI Systems and Applications (VSA) Technical Committee, 5/07–, and Digital Signal Processing (DSP) Technical Committee 10/07–
- Treasurer of Sweden section of IEEE Solid-State Circuits Society 2009-2012
- TPC co-chair: PrimeAsia 2010.
- Finance chair: ECCTD 2011
- Special session chair: ECCTD 2015
- Track chair: Asilomar 2008, CrownCom 2010, Eusipco 2010, 2015
- Student paper contest chair: Asilomar 2011
- International coordinator: ISCAS 2012
- TPC member: Norchip 2006, 2009, 2011, ISCAS 2008–, PATMOS 2008–, PrimeAsia 2009, ICGCS 2010, ICECS 2010, ICCSP 2011, Eusipco 2011, INMIC 2011
- Student paper contest jury member: Asilomar 2010
- Editor for the Proceedings of National Conference on Radio Sciences (RVK), 2005
- Organizer of special conference session: CrownCom 2010
- Co-organizer of special conference session: ISCAS 2005
- Conference session chair: ISCAS 2005, 2008–2013, 2015, Norchip 2006, Asilomar 2008, CrownCom 2010, ECCTD 2011
- Reviewer of around 130 papers for 32 different international journals and regular reviewer for the major circuits and systems conferences: ISCAS, ICECS, ECCTD, etc.

#### Håkan Johansson, CV

### Håkan Johansson's Curriculum Vitae

Prof. Johansson's research encompasses theory and design of efficient signal processing systems, mainly for communication applications. During the past decade, Prof. Johansson has developed many different signal processing algorithms for various purposes, including filtering, sampling rate conversion, signal reconstruction, and parameter estimation. He has developed new estimation and compensation algorithms for errors in analog circuits such as compensation of mismatch errors in time-interleaved analog-to-digital converters and mixers. He is one of the founders of the company Signal Processing Devices Sweden AB that sells this type of advanced signal processing.

#### **Degrees and Employments**

- 1. M.Sc. degree in Computer Science and Engineering (Civ.ing, D-linjen), 1995.
- 2. Ph.D. (Tekn. Dr.) degree in Electronics Systems (Elektroniksystem), 1998. Title: "Synthesis and realization of high-speed recursive digital filters." Supervisor: Prof. Lars Wanhammar.
- 3. Post doctoral position at Tampere University of Technology, 1998-1999.
- 4. Docent degree in Electronics Systems, Sept. 2001.
- 5. Current permanent position (100%): Prof. at Dept. EE Linköping University, from July 2004.
- 6. Previous positions:
  - Part-time positions: System developer at the company Signal Processing Devices Sweden AB, different periods, 2007-2013.
  - Ass. Prof. (universitetslektor) at Dept. EE Linköping University, July 1999-June 2004.
  - Post doctoral position at Tampere University of Technology, 1998-1999.
  - Ph. D. student at Dept. EE Linköping University, 1995-1998.
- 7. Interruption in research
  - A total of three years' work in industry, 2007-2013.
  - Parental leave: 75% during 3 months, November 2004-January 2005, and 50% during 3 months, February-May 2007.
- 8. Main supervisor of 6 graduated Ph.D. students
  - Muhammad Abbas, February 2012, Zaka Ullah Sheikh, March 2012
  - Amir Eghbali, December 2010, Mattias Olsson, June 2008
  - Linnea Rosenbaum, June 2007, Per Löwenborg, December 2002

#### **Selected previous research grants** (Approximate USD numbers based on 1 USD = 8 SEK)

Principal Investigator (PI):

- Swedish Research Council, "Modulation-sequence based analog-to-digital conversion", 2010-2012. 280,000 USD.
- Swedish Research Council, "Efficient and flexible digital signal processing algorithms", 2006-2008. 240,000 USD.

Co-PI:

- Swedish Foundation for Strategic Research, grant holder Erik G. Larsson, Div. of Communication Systems, Dept. of EE, LiU, "Algorithms and architectures for baseband signal processing", 2008-2011. In total 1,040,000 USD, Prof. Johansson's part 190,000 USD.
- The Swedish Defence Materiel Administration (FMV, Program TAIS), "Components and signal processing for energy starved microwave sensors", cooperation between Chalmers (grant holder Staffan Rudner), LiU, FOI, and SAAB Microwave Systems, 2006-2009. In total 1,250,000 USD, Prof. Johansson's part 305,000 USD.
- Swedish Research Council, grant holder Christer Svensson, Div. of Electronic Devices, Dept. of EE, LiU, "Direct RF sampling for flexible radio architectures", 2006-2008. In total 560,000 USD, Prof. Johansson's part 200,000 USD.

### Publications

- 4 books
- 61 international journal papers (3 invited).
- 122 international conference papers (3 invited).
- 13 national conference papers.
- 4 book chapters (invited)

#### Patents

- 7 patents (plus 2 pending)

#### Awards

- Co-author of best paper in J. Circuits, Syst., Comp., Special Issue on Frequency-Response Masking Technique, 2003.
- Co-author of best paper at *IEEE Nordic Signal Processing Symp.* 2002.
- Co-author of best student paper at IEEE Midwest Symp. 1999.

### **Editorial commitments**

- Associate Editor of IEEE Trans. on Circuits and Systems-I, 2008-2009, and 2014-2015.
- Area Editor of the Elsevier J. Digital Signal Processing, since 2013.
- Associate Editor of IEEE Trans. Signal Processing, 2006-2008.
- Associate Editor of IEEE Signal Processing Letters, 2004-2007.

- Associate Editor of IEEE Trans. on Circuits and Systems-II, 2000-2001 and 2007.

### Award committee commitments

Committee member for the selection of the IEEE Circuits and Systems Society Guillemin-Cauer and Darlington best paper awards, 2011-2013.

#### Selected conference program committees

- Member of the ComSoc Signal Processing for Communications and Electronics Technical Committee (SPCE TC), since 2014.
- Member of IEEE Int. Symp. Circuits Syst. (ISCAS), DSP track committee, since 2000.
- Member of the Technical Program Committee of the IEEE Wireless Communications and Networking Conference (WCNC), 2013-2014.
- Member of the Technical Program Committee of the IEEE Global Communications Conference (GLOBECOM), 2009-2015.
- Member of the Technical Program Committee of the IEEE International Conference on Communications (ICC), 2011-2013, 2015.
- Member of the Program Committee of the International Symposium on Image and Signal Processing and Analysis (ISPA), 2007, 2009, 2012, 2015.
- Technical Program Committee Reviewer for the 2012 IEEE Symposium on Industrial Electronics and Applications (ISIEA), 2012.
- Member of the Technical Program Committee of the First IEEE International Conference on Communications in China: Signal Processing for Communications (ICCC'12 - SPC), 2012.
- Member of the Technical Program Committee of the Signal Processing and Applications (SPA) track of Mosharaka International Conference on Communications and Signal Processing (MIC-CSP), 2011, 2012.
- Member of the Technical Program Committee for I-WASP 2011, International Workshop on Applications of Signal Processing (I-WASP 2011), 2011.
- Member of the Technical Program Committee of the 10th International Symposium on Communications and Information Technologies (ISCIT), 2010.
- Track co-chair for the IEEE International Conference on Electronics, Circuits, and Systems (ICECS), 2008.
- Member of the Technical Program Committee of the EURASIP European Conference on Signal Processing (EUSIPCO) 2008.
- Member of the Technical Committee of the 18th European Conference on Circuit Theory and Design (ECCTD), 2007.

## **Opponent/evaluator/expert commitments**

- 20 thesis opponent/evaluator/reviewer assignments, 2000-2013.
- Reviewer for the Italian Research and University Evaluation Agency (ANVUR), Sept. 2012- Feb. 2013. "Evaluation of the Italian research system for the period 2004-2010".

#### **Referee commitments**

- Reviewer of more than 200 papers for international journals during 2000-2015.
- Responsible for review processes (40 papers) for IEEE Int. Symp. Circuits Syst., 2001–2006.
- Reviewer of more than 100 papers for international conferences during 2000-2015.

#### Teaching

- Undergraduate courses (Linköping University): Analog Circuits, Analog Filters, Circuit Theory, Linear Systems, Digital Filters, Signal Theory
- Graduate courses (Linköping University): Multirate Digital Signal Processing, Wave Digital Filters

#### **Development of course books**

- Co-author of the book Digital Filters, 448-674 pages, 1997-2013 (7 editions).
- Author of the book Discrete-Time Systems, 367 pages, 2007.
- Author of the book Tidsdiskreta System, 316 pages, 2001-2003 (3 editions).
- Co-author of the book Analoga Filter, 222-370 pages, 1999-2003 (5 editions).

### Supervision of final projects

About 40 final projects at Master and Bachelor levels in 1997-2014.

## **Entrepreneurial achievements**

- Signal Processing Devices Sweden AB secures EUR 3 Million from SEB Företagsinvest, the venture capital arm of Skandinaviska Enskilda Banken, 2007.
- Winner of VINNOVA's competition Vinn Nu 2004.
- Cofounder of the company Signal Processing Devices Sweden AB, 2004.

# LIST OF PUBLICATIONS FOR OSCAR GUSTAFSSON 2007–

Citation count from Google Scholar 2015-03-30

# **JOURNAL PAPERS**

- [J1] \* O. Gustafsson, "Lower bounds for constant multiplication problems," IEEE Trans. Circuits Syst.-II, vol. 54, no. 11, pp. 974-978, Nov. 2007. Number of citations: 60.
- [J2] O. Gustafsson, "Comments on 'A 70 MHz multiplierless FIR Hilbert transformer in 0.35 µm standard CMOS library'," IEICE Trans. Fundamentals, vol. E91-A, no. 3, pp. 899-900, Mar. 2008. Number of citations: 1.
- [J3] K. Johansson, O. Gustafsson, and L. Wanhammar, "Implementation of elementary functions for logarithmic number systems," IET Computers and Digital Techniques, vol. 2, no. 4, pp. 295-304, July 2008. Number of citations: 12.
- [J4] \* A. Blad and O. Gustafsson, "Integer linear programming-based bit-level optimization for high-speed FIR decimation filter architectures," Circuits, Systems, and Signal Processing, vol. 29, no. 1, pp. 81-101, Feb. 2010. Number of citations: 14.
- [J5] O. Gustafsson and F. Qureshi, "Addition aware quantization for low complexity and high precision constant multiplication," *IEEE Signal Processing Letters*, vol. 17, no. 2, pp. 173–176, Feb. 2010. Number of citations: 12.
- [J6] M. Faust, O. Gustafsson, and C.-H. Chang, "Fast and VLSI efficient binary-to-CSD encoder using bypass signal," *Electronics Letters*, vol. 47, no. 1, Jan. 6, 2011. Number of citations: 4.
- [J7] E. G. Larsson and O. Gustafsson, "The impact of dynamic voltage and frequency scaling on multicore DSP algorithm design," *IEEE Signal Processing Mag.*, vol. 28, no. 3, pp. 127–144, May 2011. Number of citations: 5.
- [J8] M. Garrido, J. Grajal, and O. Gustafsson, "Optimum circuits for bit reversal," IEEE Trans. Circuits Syst. II, vol. 58, no. 10, pp. 657-661, Oct. 2011. Number of citations: 9.
- [J9] M. Garrido, O. Gustafsson, and J. Grajal, "Accurate rotations based on coefficient scaling," IEEE Trans. Circuits Syst. II, vol. 58, no. 10, pp. 662-666, Oct. 2011. Number of citations: 7.
- [J10] F. Qureshi and O. Gustafsson, "Low-complexity constant multiplication based on trigonometric identities with applications to FFTs," IEICE Trans. Fundamentals of Electronics, Communications, and Computer Sciences, vol. E94-A, no. 11, pp. 2361-2368, Nov. 2011. Number of citations: 5.
- [J11] S. Athar, O. Gustafsson, F. Qureshi, and I. Kale, "On the efficient computation of single-bit input word length pipelined FFTs," *IEICE Electronics Express*, vol. 8, no. 17, Sept. 10, 2011. Number of citations: 0.
- [J12] Z. U. Sheikh and O. Gustafsson, "Linear programming design of coefficient decimation FIR filters," IEEE Trans. Circuits Syst. II, vol. 59, no. 1, Jan 2012. Number of citations: 4.
- [J13] \* M. Garrido, J. Grajal, M. Sanchez, and O. Gustafsson, "Pipelined radix-2^k feedforward FFT architectures," IEEE Trans. VLSI Systems, vol. 21, no. 1, pp. 23-32, Jan. 2013. Number of citations: 40.
- [J14] F. Qureshi, M. Garrido, and O. Gustafsson, "Unified architecture for 2, 3, 4, 5, and 7-point DFTs based on the Winograd Fourier transform algorithm," *Electronics Letters*, vol. 49, no. 5, Feb. 28, 2013. Number of citations: 2.
- [J15] M. Abbas, O. Gustafsson, and H. Johansson, "On the fixed-point implementation of fractional-delay filters based on the Farrow structure," IEEE Trans. Circuits Syst. I - Regular Papers, vol. 60, no. 4, pp. 926–937, Apr. 2013. Number of citations:
- [J16] A. Eghbali, H. Johansson, O. Gustafsson, and S. J. Savory, "Optimal least-squares FIR digital filters for compensation of chromatic dispersion in digital coherent optical receivers," *IEEE/OSA J. Lightwave Technology*, vol. 32, no. 8, pp. 1449-1456, Apr. 2014. Number of citations: 5.
- [J17] \* M. Garrido, F. Qureshi, and O. Gustafsson, "Low-complexity multiplierless constant rotators based on combined coefficient selection and shift-and-add implementation (CCSSI)," IEEE Trans. Circuits Syst. I, vol. 61, no. 7, pp. 2002–2012, 2014. Number of citations: 0.
- [J18] S. A. Alam and O. Gustafsson, "Design of finite word length linear-phase FIR filters in the logarithmic number system domain," VLSI Design, 2014. Number of citations: 0.
- [J19] N. Afzal, O. Gustafsson, and J. J. Wikner, "Reducing Complexity and Power of Digital Multi-Bit Error-Feedback Delta-Sigma-Modulators," IEEE Trans. Circuits Syst. II, 2014. Number of citations: 0.

## INTERNATIONAL CONFERENCE PAPERS

- [C1] S. Tahmasbi Oskuii, P. G. Kjeldsberg, and O. Gustafsson, "Transition-activity aware design of reduction-stages for parallel multipliers," *Great Lakes Symp. on VLSI*, Stresa-Lago Maggiore, Italy, March 11-13, 2007. Number of citations: 11.
- [C2] O. Gustafsson and H. Johansson, "Complexity comparison of linear-phase Mth-band and general FIR filters," *IEEE Int. Symp. Circuits Syst.*, New Orleans, LA, May 27-30, 2007, pp. 2335–2338. Number of citations: 1.
- [C3] O. Gustafsson, "A difference based adder graph heuristic for multiple constant multiplication problems," IEEE Int. Symp. Circuits Syst., New Orleans, LA, May 27-30, 2007, pp. 1097–1100. Number of citations: 62.
- [C4] O. Gustafsson and M. Olofsson, "Complexity reduction of constant matrix computations over the binary field," in Proc. Int. Workshop on the Arithmetic of Finite Fields, Madrid, Spain, June 21-22, 2007, pp. 103–115. Number of citations: 9.
- [C5] O. Gustafsson, S. Tahmasbi Oskuii, K. Johansson, and P. G. Kjeldsberg, "Switching activity reduction of MAC-based FIR filters with correlated input data," in *Proc. Int. Workshop Power Timing Modeling, Optimization, Simulation*, Göteborg, Sweden, Sept. 3–5, 2007, pp. 526–535. Number of citations: 4.
- [C6] A. Eghbali, O. Gustafsson, H. Johansson, and P. Löwenborg, "On the complexity of multiplierless direct and polyphase FIR filter structures," in *Proc. International Symposium on Image and Signal Processing and Analysis*, Istanbul, Turkey, Sept. 27–29, 2007, pp. 200–205. Number of citations: 4.
- [C7] L. Wanhammar, B. Soltanian, K. Johansson, and O. Gustafsson, "Synthesis of circulator-tree wave digital filters," in Proc. International Symposium on Image and Signal Processing and Analysis, Istanbul, Turkey, Sept. 27–29, 2007, pp. 206–211. Number of citations: 2.
- [C8] O. Gustafsson and L. Wanhammar, "Low-complexity constant multiplication using carry-save arithmetic for high-speed digital filters," in International Symposium on Image and Signal Processing and Analysis, Istanbul, Turkey, Sept. 27-29, 2007, pp. 212–217. Number of citations: 11.
- [C9] O. Gustafsson, L. S. DeBrunner, V. DeBrunner, and H. Johansson, "On the design of sparse half-band like FIR filters," in Proc. Asilomar Conf. Signals Syst. Comp., Pacific Grove, CA, Nov. 4-7, 2007. Number of citations: 26.

- [C10] S. Tahmasbi Oskuii, P. G. Kjeldsberg, and O. Gustafsson, "Power optimized partial product reduction interconnect ordering in parallel multipliers," in *Proc. IEEE Norchip Conf.*, Aalborg, Denmark, Nov. 19–20, 2007. Number of citations: 3.
- [C11] K. Johansson, O. Gustafsson, and L. Wanhammar, "Bit-level optimization of shift-and-add based FIR filters" IEEE Int. Conf. Elec. Circuits Syst., Marrakesh, Morocco, Dec. 11–14, 2007. Number of citations: 18.
- [C12] U. Meyer-Baese, O. Gustafsson, and A. Dempster, "The canonical minimised adder graph representation" in *Proceedings SPIE*, Mar. 2008. Number of citations: 2.
- [C13] S. Tahmasbi Oskuii, K. Johansson, O. Gustafsson, and P. G. Kjeldsberg, "Power optimization of weighted bit-product summation tree for elementary function generator," in *Proc. IEEE Int. Symp. Circuits Syst.*, Seattle, WA, May 18–21, 2008. Number of citations: 2.
- [C14] K. Johansson, O. Gustafsson, and L. Wanhammar, "Switching activity estimation for shift-and-add based constant multipliers," in *Proc. IEEE Int. Symp. Circuits Syst.*, Seattle, WA, May 18–21, 2008. Number of citations: 7.
- [C15] A. Blad and O. Gustafsson, "Bit-level optimized high-speed architectures for decimation filter applications," in Proc. IEEE Int. Symp. Circuits Syst., Seattle, WA, May 18–21, 2008. Number of citations: 5.
- [C16] L. Wanhammar, B. Soltanian, O. Gustafsson, and K. Johansson, "Synthesis of bandpass circulator-tree wave digital filters," in Proc. IEEE Int. Conf. Elec. Circ. Syst., St. Julians, Malta, Aug. 31–Sept. 3, 2008. Number of citations: 1.
- [C17] M. Abbas, F. Qureshi, Z. Sheikh, O. Gustafsson, H. Johansson, and K. Johansson, "Comparison of multiplierless implementation of nonlinear-phase versus linear-phase FIR filters," in *Proc. Asilomar Conf. Signals Syst. Comp.*, Pacific Grove, CA, Oct. 26–29, 2008. Number of citations: 0.
- [C18] E. Lindahl and O. Gustafsson, "Architecture-aware design of a decimation filter based on a dual wordlength multiply-accumulate unit," in *Proc. Asilomar Conf. Signals Syst. Comp.*, Pacific Grove, CA, Oct. 26–29, 2008. Number of citations: 1.
- [C19] O. Gustafsson, "Towards optimal multiple constant multiplication: a hypergraph approach," in *Proc. Asilomar Conf. Signals Syst. Comp.*, Pacific Grove, CA, Oct. 26–29, 2008. Number of citations: 9.
- [C20] O. Gustafsson and K. Johansson, "An empirical study on standard cell synthesis of elementary function look-up tables," in Proc. Asilomar Conf. Signals Syst. Comp., Pacific Grove, CA, Oct. 26–29, 2008. Number of citations: 2.
- [C21] A. Havashki, L. Lundheim, P. G. Kjeldsberg, O. Gustafsson, and G. E. Øien, "Analysis of switching activity in DSP signals in the presence of noise," *IEEE Eurocon*, St. Petersburg, Russia, May 18–23, 2009. Number of citations: 0.
- [C22] F. Qureshi and O. Gustafsson, "Low-complexity reconfigurable complex constant multiplication for FFTs," IEEE Int. Symp. Circuits Syst., Taipei, Taiwan, May 24–27, 2009. Number of citations: 23.
- [C23] M. Abbas, O. Gustafsson, and H. Johansson, "Scaling of fractional delay filters based on the Farrow structure," *IEEE Int. Symp. Circuits Syst.*, Taipei, Taiwan, May 24–27, 2009. Number of citations: 1.
- [C24] K. Johansson, O. Gustafsson, and L. DeBrunner, "Estimation of the Switching Activity in Shift-and-Add Based Computations," in *Proc. IEEE Int. Symp. Circuits Syst.*, Taipei, Taiwan, May 24–27, 2009. Number of citations: 1.
- [C25] F. Qureshi and O. Gustafsson, "Analysis of Twiddle Factor Memory Complexity of Radix-2<sup>i</sup> Pipelined FFTs," in Proc. Asilomar Conf. Signals Syst. Comp., Pacific Grove, CA, Nov. 1–4, 2009. Number of citations: 2.
- [C26] O. Gustafsson and K. Johansson, "Techniques for Avoiding Sign-Extension in Multiple Constant Multiplication," in Proc. Asilomar Conf. Signals Syst. Comp., Pacific Grove, CA, Nov. 1–4, 2009. Number of citations: 2.
- [C27] K. Johansson, L. S. DeBrunner, O. Gustafsson, and V. DeBrunner, "Design of Multiplierless FIR Filters with an Adder Depth Versus Filter Order Trade-Off," in *Proc. Asilomar Conf. Signals Syst. Comp.*, Pacific Grove, CA, Nov. 1–4, 2009. Number of citations: 3.
- [C28] F. Qureshi and O. Gustafsson, "Twiddle factor memory switching activity analysis of radix-2<sup>2</sup> and equivalent FFT algorithms," in *Proc. IEEE Int. Symp. Circuits Syst.*, Paris, France, May 30–June 2, 2010. Number of citations: 7.
- [C29] A. Blad and O. Gustafsson, "Redundancy reduction for high-speed FIR filter architectures based on carry-save adder trees," in Proc. IEEE Int. Symp. Circuits Syst., Paris, France, May 30–June 2, 2010. Number of citations: 3.
- [C30] O. Gustafsson, K. Amiri, D. Andersson, A. Blad, C. Bonnet, J. R. Cavallaro, J. Declerckz, A. Dejonghe, P. Eliardsson, M. Glasse, A. Hayar, L. Hollevoet, C. Hunter, M. Joshi, F. Kaltenberger, R. Knopp, K. Le, Z. Miljanic, P. Murphy, F. Naessens, N. Nikaein, D. Nussbaum, R. Pacalet, P. Raghavan, A. Sabharwal, O. Sarode, P. Spasojevic, Y. Sun, H. M. Tullberg, T. Vander Aa, L. Van der Perre, M. Wetterwald, and M. Wu, "Architectures for cognitive radio testbeds and demonstrators – An overview," in *Proc. Int. Conf. Cognitive Radio Oriented Wireless Networks Comm.*, Cannes, France, June 9–11, 2010. Number of citations: 16.
- [C31] Z. U. Sheikh, O. Gustafsson, and L. Wanhammar, "Design of sparse non-periodic narrow-band and wide-band FRM-like FIR filters," in *Proc. IEEE Int. Conf. Green Circuits Syst.*, Shanghai, China, June 21-23, 2010. Number of citations: 1.
- [C32] M. Abbas, O. Gustafsson, and L. Wanhammar, "Power estimation of recursive and non-recursive CIC filters implemented in deep-submicron technology," in *Proc. IEEE Int. Conf. Green Circuits Syst.*, Shanghai, China, June 21-23, 2010. Number of citations: 13.
- [C33] O. Gustafsson, K. Khursheed, M. Imran, and L. Wanhammar, "Generalized overlapping digit patterns for multi-dimensional sub-expression sharing," in *Proc. IEEE Int. Conf. Green Circuits Syst.*, Shanghai, China, June 21-23, 2010, pp. 65–68. Number of citations: 0.
- [C34] Z. U. Sheikh and O. Gustafsson, "Design of narrow-band and wide-band frequency response masking filters using sparse nonperiodic sub-filters," in *Proc. European Signal Processing Conf.*, Aalborg, Denmark, Aug. 23–27, 2010. Number of citations: 2.
- [C35] A. Blad, O. Gustafsson, M. Zheng, and Z. Fei, "Integer linear programming based optimization of puncturing sequences for quasi-cyclic low-density parity-check codes" in *Proc. Int. Symp. Turbo Codes & Iterative Inf. Processing*, Brest, France, Sept. 6–10, 2010. Number of citations: 1.
- [C36] M. Imran, K. Khursheed, M. O'Nils, and O. Gustafsson, "On the number representation in sub-expression sharing," in Proc. Int. Conf. Signal Elec. Syst., Gliwice, Poland, Sept. 7–10, 2010, pp. 17–20. Number of citations: 3.
- [C37] M. Abbas and O. Gustafsson, "Switching activity estimation of CIC filter integrators," in Proc. PrimeAsia, Shanghai, China, Sept. 22–24, 2010. Number of citations: 1.
- [C38] F. Qureshi, S. A. Alam, and O. Gustafsson, "4k-point FFT algorithms based on optimized twiddle factor multiplication for FPGAs," in *Proc. PrimeAsia*, Shanghai, China, Sept. 22–24, 2010. Number of citations: 3.
- [C39] A. Blad, O. Gustafsson, M. Zheng, and Z. Fei, "Rate-compatible LDPC code decoder using check-node merging," in Proc. Asilomar Conf. Signals, Syst. Comp., Pacific Grove, CA, Nov. 7–10, 2010. Number of citations: 1.
- [C40] C. Liu, O. Gustafsson, B. Ng, and B. Phillips, "Estimating Arithmetic for Decimation Filters," in Proc. Asilomar Conf. Signals, Syst. Comp., Pacific Grove, CA, Nov. 7–10, 2010. Number of citations: 0.

- [C41] M. Abbas and O. Gustafsson, "Low-Complexity Parallel Evaluation of Powers by Exploitation of Redundancy," in Proc. Asilomar Conf. Signals, Syst. Comp., Pacific Grove, CA, Nov. 7-10, 2010. Number of citations: 0.
- [C42] M. Faust, C.-H. Chang, and O. Gustafsson, "Reconfigurable Multiple Constant Multiplication Using Minimum Adder Depth," in *Proc. Asilomar Conf. Signals, Syst. Comp.*, Pacific Grove, CA, Nov. 7–10, 2010. Number of citations: 3.
- [C43] F. Qureshi, M. Garrido, and O. Gustafsson, "Alternatives for low-complexity rotators," in Proc. IEEE Int. Conf. Elec. Circuits, Syst., Athens, Greece, Dec. 12–15, 2010. Number of citations: 0.
- [C44] S. A. Alam and O. Gustafsson, "Implementation of time-multiplexed sparse periodic FIR filters for FRM on FPGAs," IEEE Int. Symp. Circuits Syst., Rio de Janeiro, Brazil, May 15-18, 2011. Number of citations: 2.
- [C45] K. Johansson, O. Gustafsson, L. S. DeBrunner, and L. Wanhammar, "Minimum adder depth multiple constant multiplication algorithm for low power FIR filters," IEEE Int. Symp. Circuits Syst., Rio de Janeiro, Brazil, May 15-18, 2011. Number of citations: 12.
- [C46] C. Ingemarsson and O. Gustafsson, "On Using the Logarithmic Number System for Finite Wordlength Matrix Inversion," in Proc. IEEE Midwest Symp. Circuits Syst., Seoul, South Korea, Aug. 7-10, 2011. Number of citations: 0.
- [C47] F. Qureshi and O. Gustafsson, "Generation of All Radix-2 Fast Fourier Transform Algorithms Using Binary Trees," in Proc. European Conf. Circuit Theory Design, Linköping, Sweden, Aug. 29-31, 2011. Number of citations: 4.
- [C48] C. Ingemarsson and O. Gustafsson, "Finite Wordlength Properties of Matrix Inversion Algorithms in Fixed-point and Logarithmic Number systems," in Proc. European Conf. Circuit Theory Design, Linköping, Sweden, Aug. 29-31, 2011. Number of citations: 0.
- [C49] A. Blad and O. Gustafsson, "FPGA implementation of rate-compatible QC-LDPC code decoder," in *Proc. European Conf. Circuit Theory Design*, Linköping, Sweden, Aug. 29–31, 2011. Number of citations: 2.
- [C50] S. Athar and O. Gustafsson, "Optimization of AIQ Representations for Low Complexity Wavelet Transforms," in *Proc. European Conf. Circuit Theory Design*, Linköping, Sweden, Aug. 29-31, 2011. Number of citations: 4.
- [C51] T. Ahmed, M. Garrido, and O. Gustafsson, "A 512-point 8-parallel pipelined feedforward FFT for WPAN," in *Proc. Asilomar Conf. Signals Syst. Comp.*, Pacific Grove, CA, Nov. 6-9, 2011. Number of citations: 6.
- [C52] M. Abbas and O. Gustafsson, "Computational and Implementation Complexity of Polynomial Evaluation Schemes," in Proc. IEEE NorChip Conf., Lund, Sweden, Nov. 14-15, 2011. Number of citations: 1.
- [C53] P. Källström and O. Gustafsson, "Magnitude Scaling for Increased SFDR in DDFS," in Proc. IEEE NorChip Conf., Lund, Sweden, Nov. 14-15, 2011. Number of citations: 0.
- [C54] S. A. Alam and O. Gustafsson, "Implementation of Narrow-Band Frequency-Response Masking for Efficient Narrow Transition Band FIR Filters on FPGAs," in Proc. IEEE NorChip Conf., Lund, Sweden, Nov. 14-15, 2011. Number of citations: 1.
- [C55] C. Ingemarsson, P. Källström, and O. Gustafsson, "Using DSP block pre-adders in pipeline SDF FFT implementations in contemporary FPGAs," in Proc. 22nd Int. Conf. on Field Programmable Logic and Applications, Oslo, Norway, Aug. 29-31, 2012. Number of citations: 0.
- [C56] P. Källström, M. Garrido, and O. Gustafsson, "Low-complexity rotators for the FFT using base-3 signed stages," in Proc. IEEE Asia Pacific Conf. Circuits Syst., Nov. 30-Dec. 1, Kaohsiung, Taiwan, 2012. Number of citations: 1.
- [C57] P. P. Boopal, M. Garrido, and O. Gustafsson, "A reconfigurable FFT architecture for variable-length and multi-streaming OFDM standards," in Proc. IEEE Int. Symp. Circuits Syst., Beijing, China, May 19-23, 2013. Number of citations: 0.
- [C58] \* O. Gustafsson and A. Ehliar, "Low-complexity general FIR filters based on Winograd's inner product algorithm," in Proc. IEEE Int. Symp. Circuits Syst., Beijing, China, May 19-23, 2013. Number of citations: 0.
- [C59] M. R. Sadeghifar, J. J. Wikner, and O. Gustafsson, "Linear programming design of semi-digital FIR Filter and ΣΔ modulator for VDSL2 transmitter," in Proc. IEEE Int. Symp. Circuits Syst., Melbourne, Australia, 2014. Number of citations: 0.
- [C60] M. Garrido, M. Acevedo, A. Ehliar, and O. Gustafsson, "Challenging the limits of FFT performance on FPGAs," in Proc. Int. Symp. Integrated Circuits, 2014, pp. 172–175. Number of citations: 0.
- [C61] O. Gustafsson and H. Johansson, "Decimation filters for high-speed delta-sigma modulators with passband constraints: General versus CIC-based FIR filters," in Proc. IEEE Int. Symp. Circuits Syst., 2015. Number of citations: 0.

## **BOOK CHAPTERS**

- [B1] O. Gustafsson and L. Wanhammar, "Arithmetic," in Handbook of Signal Processing Systems, Springer, 2010. Number of citations: 3.
- [B2] O. Gustafsson and L. Wanhammar, "Low-complexity and high-speed constant multiplications for digital filters using carry-save arithmetic," in Digital Filters, Intech, 2011. Number of citations: 1.
- [B3] H. Johansson and O. Gustafsson, "Two-Rate Based Structures for Computationally Efficient Wide-Band FIR Systems," in *Digital Filters and Signal Processing*, pp. 189-212, Intech, 2013. Number of citations: 0.
- [B4] O. Gustafsson and L. Wanhammar, "Arithmetic," in *Handbook of Signal Processing Systems*, 2nd edition, Springer, 2013. Number of citations: 3.

## MOST CITED PAPERS

- [N1] O. Gustafsson, A. G. Dempster, and L. Wanhammar, "Extended results for minimum-adder constant integer multipliers," in *Proc. IEEE Int. Symp. Circuits Syst.*, Scottsdale, AZ, May 26–29, 2002, vol. 1, pp. 73–76. Number of citations: 90.
- [N2] O. Gustafsson, "A difference based adder graph heuristic for multiple constant multiplication problems," IEEE Int. Symp. Circuits Syst., New Orleans, LA, May 27-30, 2007, pp. 1097–1100. Number of citations: 62.
- [N3] O. Gustafsson, "Lower bounds for constant multiplication problems," IEEE Trans. Circuits Syst.-II, vol. 54, no. 11, pp. 974-978, Nov. 2007. Number of citations: 60.
- [N4] O. Gustafsson, A. G. Dempster, K. Johansson, M. D. Macleod, and L. Wanhammar, "Simplified design of constant coefficient multipliers," *Circuits, Systems and Signal Processing*, vol. 25, no. 2, pp. 225–251, Apr. 2006. Number of citations: 53.
- [N5] O. Gustafsson and L. Wanhammar, "ILP modelling of the common subexpression sharing problem," in Proc. IEEE Int. Conf. Electronics Circuits Syst., Dubrovnik, Croatia, Sept. 15-18, 2002, vol. 3, pp. 1171-1174. Number of citations: 53.

# Håkan Johansson's List of Publications

Number of citations taken from Google Scholar 2015-03-28. (http://scholar.google.se/citations?user=gHVEbL4AAAAJ&hl=sv)

## FIVE MOST CITED PAPERS

- 1. H. Johansson and P. Löwenborg, "Reconstruction of nonuniformly sampled bandlimited signals by means of digital fractional delay filters," *IEEE Trans. Signal Processing*, vol. 50, no. 11, pp. 2757–2767, Nov. 2002. Number of citations: 143
- 2. H. Johansson and P. Löwenborg, "On the design of adjustable fractional delay FIR filters," *IEEE Trans. Circuits Syst. II*, vol. 50, no. 4, pp. 164–169, Apr. 2003. Number of citations: 92
- 3. C. Vogel and H. Johansson, "Time-interleaved analog-to-digital converters: Status and future directions," in *Proc. IEEE Int. Symp. Circuits Syst.*, Kos, Greece, May 21–24, 2006. Number of citations: 73
- 4. P. Löwenborg, H. Johansson, and L. Wanhammar, "Two-channel digital and hybrid analog/digital multirate filter banks with very low complexity analysis or synthesis filters," *IEEE Trans. Circuits Syst. II*, vol. 50, no. 7, pp. 355–367, July 2003. Number of citations: 59
- 5. H. Johansson and P. Löwenborg, "Reconstruction of nonuniformly sampled bandlimited signals by means of time-varying discrete-time FIR filters," EURASIP J. Applied Signal Processing – Special Issue on Frames and Overcomplete Representations in Signal Processing, Communications, and Information Theory, vol. 2006, Article ID 64185, 18 pages, 2006. Number of citations: 54

\* Five most important publications for the project

## PEER-REVIEWED JOURNAL PAPERS, 2007-2015

- [42] Y. Wang, H. Xu, H. Johansson, Z. Sun, and J. J. Wikner, "Digital estimation and compensation method for nonlinearity mismatches in time-interleaved analog-to-digital converters", *Digital Signal Processing (Elsevier)*, accepted. Number of citations: 0
- [41] Y. Wang, H. Johansson, H. Xu, and Z. Sun, "Joint blind calibration for mixed mismatches in two-channel time-interleaved ADCs," *IEEE Trans. Circuits Syst. I - Regular Papers*, accepted. Number of citations: 0
- [40] A. K. M. Pillai and H. Johansson, "Prefilter-based reconfigurable reconstructor for time-interleaved ADCs with missing samples," *IEEE Trans. Circuits Syst. II Express Briefs*, accepted. Number of citations: 0
- [39] H. Johansson and H. Göckler, "Two-stage based polyphase structures for arbitrary-integer sampling rate conversion," *IEEE Trans. Circuits Syst. II Express Briefs*, accepted. Number of citations: 0
- [38] A. Eghbali and H. Johansson, "Design of modulated filter banks and transmultiplexers with unified initial solutions and very few unknown parameters," *IEEE Trans. Circuits Syst. II Express Briefs*, accepted. Number of citations: 0
- [37]\* H. Johansson and F. Harris, "Polyphase decomposition of digital fractional-delay filters," *IEEE Signal Processing Lett.*, vol. 22, no. 8, pp. 1021–1025, Aug. 2015. Number of citations: 0
- [36] Y. Wang, H. Johansson, and H. Xu, "Adaptive background estimation for static nonlinearity mismatches in two-channel TIADCs," *IEEE Trans. Circuits Syst. II - Express Briefs*, vol. 62, no. 3, pp. 226–230, Mar. 2015. Number of citations: 0
- [35]\* H. Johansson, "Relations between zero-IF receiver I/Q and TI-ADC channel mismatches," *IEEE Trans. Signal Processing*, vol. 62, no. 13, pp. 3403–3414, July 2014. Number of citations: 1
- [34] H. Johansson and A. Eghbali, "Add-equalize structures for linear-phase Nyquist FIR filter interpolators and decimators," *IEEE Trans. Circuits Syst. I: Regular Papers*, vol. 61, no. 6, pp. 1766–1777, June 2014. Number of citations: 2
- [33] H. Johansson and A. Eghbali, "Two polynomial FIR filter structures with variable fractional delay and phase shift," *IEEE Trans. Circuits Syst. I: Regular Papers, Special Issue on ISCAS-2013, Invited Paper*, vol. 61, no. 5, pp. 1355–1365, May 2014. Number of citations: 0
- [32]\* A. Eghbali, H. Johansson, O. Gustafsson, and S. J. Savory, "Optimal least-squares FIR digital filters for compensation of chromatic dispersion in digital coherent optical receivers," *IEEE/OSA J. Lightwave Technology*, vol. 32, no. 8, pp. 1449–1456, Apr. 15, 2014. Number of citations: 5
- [31] W. J. Xu, Y. J. Yu, and H. Johansson, "Improved filter bank approach for the design of variable bandedge and fractional delay filters," *IEEE Trans. Circuits Syst. I: Regular Papers*, vol. 61, no. 3, pp. 764–777, Mar. 2014. Number of citations: 0
- [30] A. Eghbali, T. Saramäki, and H. Johansson, "Conditions for *L*th-band filters of order 2*N* as cascades of identical linear-phase FIR spectral factors of order *N*," *Signal Processing*, vol. 97, pp. 1–8, Apr. 2014. Number of citations: 0

- [29]\* A. Eghbali and H. Johansson, "On efficient design of high-order filters with applications to filter banks and transmultiplexers with large number of channels," *IEEE Trans. Signal Processing*, vol. 62, no. 5, pp. 1198-1209, Mar. 2014. Number of citations: 2
- [28] A. Eghbali and H. Johansson, "A class of reconfigurable and low-complexity two-stage Nyquist filters," *Signal Processing*, vol. 96, pp. 164–172, Mar. 2014. Number of citations: 1
- [27] A. K. M. Pillai and H. Johansson, "Efficient signal reconstruction scheme for *M*-channel time-interleaved ADCs," *J. Analog Integrated Circuits Signal Processing, Special Issue on IEEE NEWCAS 2012*, vol. 77, no. 2, pp. 113–122, Nov. 2013. *Invited Paper*. Number of citations: 0
- [26] H. Johansson, "On FIR filter approximation of fractional-order differentiators and integrators," *IEEE J. Emerging Selected Topics Circuits Syst.*, vol. 3, no. 3, pp. 404–415, Sept. 2013. Number of citations: 2
- [25] M. Abbas, O. Gustafsson, and H. Johansson, "On the fixed-point implementation of fractional-delay filters based on the Farrow structure," *IEEE Trans. Circuits Syst. I - Regular Papers*, vol. 60, no. 4, pp. 926–937, Apr. 2013. Number of citations: 8
- [24] A. Eghbali, H. Johansson, and T. Saramäki, "A method for the design of Farrow-structure based variable fractional-delay FIR filters," *Signal Processing*, vol. 93, no. 5, pp. 1341–1348, May 2013. Number of citations: 5
- [23] H. Johansson and E. Hermanowicz, "Two-rate based low-complexity variable fractional-delay FIR filter structures," *IEEE Trans. Circuits Syst. I: Regular Papers*, vol. 60, no. 1, pp. 136–149, Jan. 2013. Number of citations: 10
- [22] A. K. M. Pillai and H. Johansson, "Two-rate based low-complexity time-varying discrete-time FIR reconstructors for two-periodic nonuniformly sampled signals," *Sampling Theory in Signal and Image Processing Special Issue on SampTA 2011*, vol. 11, no. 2-3, pp. 195-220, 2012. Number of citations: 2
- [21] Z. U. Sheikh and H. Johansson, "Efficient wide-band FIR LTI systems derived via multi-rate techniques and sparse bandpass filters", *IEEE Trans. Signal Processing*, vol. 60, no. 7, pp. 3859–3863, July 2012. Number of citations: 8
- [20] H. Johansson, "Fractional-delay and supersymmetric *M*th-band linear-phase FIR filters utilizing partially symmetric and antisymmetric impulse responses", *IEEE Trans. Circuits Syst. II - Express Briefs*, vol. 59, no. 6, pp. 366–370, June 2012. Number of citations: 3
- [19] A. Eghbali, H. Johansson, and P. Löwenborg, "A class of multimode transmultiplexers based on the Farrow structure," *Circuits Syst. Signal Processing*, vol. 31, no. 3, pp. 961–985, June 2012. Number of citations: 9
- [18] Z. U. Sheikh and H. Johansson, "A technique for efficient realization of wide-band FIR LTI systems", *IEEE Trans. Signal Processing*, vol. 60, no. 3, pp. 1482–1486, Mar. 2012. Number of citations: 3
- [17] A. Eghbali, T. Saramäki, and H. Johansson, "On two-stage Nyquist pulse shaping filters," *IEEE Trans. Signal Processing*, vol. 60, no. 1, pp. 483–488, Jan. 2012. Number of citations: 6
- [16] Z. U. Sheikh and H. Johansson, "A class of wide-band linear-phase FIR differentiators using a two-rate approach and the frequency-response masking technique," *IEEE Trans. Circuits Syst. I: Regular Papers*, vol. 58, no. 8, pp. 1827–1839, Aug. 2011. Number of citations: 14
- [15] A. Eghbali, H. Johansson, and P. Löwenborg, "Reconfigurable nonuniform transmultiplexers using uniform modulated filter banks," *IEEE Trans. Circuits Syst. I: Regular Papers*, vol. 58, no. 3, pp. 539–547, Mar. 2011. Number of citations: 12
- [14] A. Eghbali, H. Johansson, P. Löwenborg, and H. G. Göckler, "Dynamic frequency-band reallocation and allocation: From satellite-based communication systems to cognitive radios," *J. Signal Processing Syst., Signal, Image, Video Technology*, vol. 62, no. 2, pp. 187–203, Feb. 2011. Number of citations: 17
- [13] H. Johansson, "Farrow-structure-based reconfigurable bandpass linear-phase FIR filters for integer sampling rate conversion," *IEEE Trans. Circuits Syst. II: Express Briefs*, vol. 58, no. 1, pp. 46–50, Jan. 2011. Number of citations: 10
- [12] L. Rosenbaum, P. Löwenborg, and H. Johansson, "Two classes of cosine-modulated IIR/IIR and IIR/FIR NPR filter banks," *Circuits, Syst., Signal Processing, Special issue on low power digital filter design techniques and their applications*, vol. 29, no. 1, pp. 103–133, Feb. 2010. Number of citations: 5
- [11] H. Johansson, "A polynomial-based time-varying filter structure for the compensation of frequency-response mismatch errors in time-interleaved ADCs," *IEEE J. Selected Topics in Signal Processing, Special Issue on DSP Techniques for RF/Analog Circuit Impairments*, vol. 3, no. 3, pp. 384–396, June 2009. Number of citations: 38
- [10] A. Eghbali, H. Johansson, and P. Löwenborg, "Flexible frequency-band reallocation: Complex versus real," *Circuits, Syst., Signal Processing*, vol. 28, no. 3, pp. 409–431, June 2009. Number of citations: 4
- [9] H. Johansson and P. Löwenborg, "Least-squares filter design technique for the compensation of frequency-response mismatch errors in time-interleaved analog-to-digital converters," *IEEE Trans. Circuits Syst. II*, vol. 55, no. 11, pp. 1154–1158, Nov. 2008. Number of citations: 32

- [8] E. Hermanowicz, H. Johansson, and M. A. Rojewski, "A fractionally delaying complex Hilbert transform filter," *IEEE Trans. Circuits Syst. II*, vol. 55, no. 5, pp. 452–456, May 2008. Number of citations: 6
- [7] A. Eghbali, H. Johansson, and P. Löwenborg, "A multimode transmultiplexer structure," *IEEE Trans. Circuits Syst. II, Special Issue on Multifunctional Circuits and Systems for Future Generations of Wireless Communications-I*, vol. 55, no. 3, pp. 279–283, Mar. 2008. Number of citations: 18
- [6] A. Blad, H. Johansson, and P. Löwenborg, "Multirate formulation for mismatch sensitivity analysis of analog-to-digital converters that utilize parallel sigma-delta modulators," *EURASIP J. Advances Signal Processing*, vol. 2008, Article ID 289184, 11 pages, 2008. Number of citations: 3
- [5] E. Hermanowicz and H. Johansson, "A complex variable fractional delay FIR filter structure," *IEEE Trans. Circuits, Syst. II*, vol. 54, no. 9, pp. 785–789, Sept. 2007. Number of citations: 6
- [4] H. Johansson, P. Löwenborg, and K. Vengattaramane, "Least-squares and minimax design of polynomial impulse response FIR filters for reconstruction of two-periodic nonuniformly sampled signals," *IEEE Trans. Circuits Syst. I*, vol. 54, no. 4, pp. 877–888, Apr. 2007. Number of citations: 30
- [3] L. Rosenbaum and H. Johansson, "On low-delay frequency masking FIR filters," *Circuits, Syst., Signal Processing*, vol. 26, no. 1, pp. 1-25, Feb. 2007. Number of citations: 7
- [2] L. Rosenbaum, P. Löwenborg, and H. Johansson, "An approach for synthesis of modulated *M*-channel FIR filter banks utilizing the frequency-response masking technique," *EURASIP J. Advances Signal Processing Special Issue on Multirate Systems and Applications*, vol. 2007, Article ID 68285, 13 pages, 2007. Number of citations: 15
- [1] H. Johansson and P. Löwenborg, "Flexible frequency-band reallocation network using variable oversampled complex-modulated filter banks," *EURASIP J. Advances Signal Processing – Special Issue on Multirate Systems and Applications*, vol. 2007, Article ID 63714, 15 pages, 2007. Number of citations: 24

## PEER-REVIEWED CONFERENCE PAPERS, 2007-2015

- [32] O. Gustafsson and H. Johansson, "Decimation filters for high-speed delta-sigma modulators with passband constraints: General versus CIC-based FIR filters," in *Proc. IEEE Int. Symp. Circuits Syst.*, Lisbon, Portugal, May 24–27, 2015. Number of citations: 0
- [31] Y. Wang, H. Johansson, H. Xu, and Z. Sun, "Blind calibration of nonlinearity mismatch errors in two-channel time-interleaved ADCs," in *Proc. IEEE Int. Conf. Electronics Circuits Syst.*, Marseille, France, Dec. 7–10, 2014. Number of citations: 0
- [30] A. K. M. Pillai and H. Johansson, "Two reconstructors for *M*-channel time-interleaved ADCs with missing samples", in *Proc. IEEE Int. New Circuits Syst. Conf.*, Trois-Rivières, Canada, June 22–25, 2014. Number of citations: 1
- [29]\* A. K. M. Pillai and H. Johansson, "A sub-band based reconstructor for *M*-channel time-interleaved ADCs with missing samples," *IEEE Int. Conf. Acoust., Speech, Signal Processing*, Florence, Italy, May 4–9, 2014. Number of citations: 2
- [28] A. K. M. Pillai and H. Johansson, "Efficient reconfigurable scheme for the recovery of sub-Nyquist sampled sparse multi-band signals," in *Proc. IEEE GlobalSIP Symp. Software Defined and Cognitive Radios*, Austin, TX, USA, Dec. 3–5, 2013. *Invited Paper*. Number of citations: 1
- [27] A. K. M. Pillai and H. Johansson, "Low-complexity two-rate based multivariate impulse response reconstructor for time-skew error correction in *M*-channel time-interleaved ADCs," in *Proc. IEEE Int. Symp. Circuits Syst.*, Beijing, China, May 19–23, 2013. Number of citations: 1
- [26] H. Johansson and A. Eghbali, "FIR filter with variable fractional delay and phase shift: Efficient realization and design using reweighted l<sub>1</sub>-norm minimization," in *Proc. IEEE Int. Symp. Circuits Syst.*, Beijing, China, May 19–23, 2013. Number of citations: 5
- [25] H. Johansson and A. Eghbali, "A realization of FIR filters with simultaneously variable bandwidth and fractional delay," in *Proc. European Signal Processing Conf.*, Bucharest, Romania, Aug. 27–31, 2012. Number of citations: 1
- [24] A. K. M. Pillai and H. Johansson, "Time-skew error correction in two-channel time-interleaved ADCs based on a two-rate approach and polynomial impulse responses," in *Proc. IEEE Midwest Symp. Circuits Syst.*, Boise, Idaho, USA, Aug. 5–8, 2012. Number of citations: 2
- [23] A. K. M. Pillai and H. Johansson, "Efficient signal reconstruction scheme for time-interleaved ADCs," in *Proc. IEEE Int. Northeast Workshop Circuits Syst.*, Montreal, Canada, June 17–20, 2012. Number of citations: 5
- [22] A. Eghbali and H. Johansson, "Reconfigurable two-stage Nyquist filters utilizing the Farrow structure," in *Proc. IEEE Int. Symp. Circuits Syst.*, Seoul, Korea, May 20–23, 2012. Number of citations: 1

- [21] H. Johansson, A. Eghbali, and J. Lahti, "Tree-structured linear-phase Nyquist FIR filter interpolators and decimators," in *Proc. IEEE Int. Symp. Circuits Syst.*, Seoul, Korea, May 20–23, 2012. Number of citations: 2
- [20] Z. U. Sheikh, A. Eghbali, and H. Johansson, "Linear-phase FIR digital differentiator order estimation," in *Proc. European Conf. Circuit Theory Design*, Linköping, Sweden, Aug. 29–31, 2011. Number of citations: 3
- [19] A. Eghbali and H. Johansson, "Complexity reduction in low-delay Farrow-structure-based variable fractional delay FIR filters utilizing linear-phase subfilters," in *Proc. European Conf. Circuit Theory Design*, Linköping, Sweden, Aug. 29–31, 2011. Number of citations: 1
- [18] A. Eghbali, H. Johansson, and T. Saramäki, "A new structure for reconfigurable two-stage Nyquist pulse shaping filters", in *Proc. IEEE Midwest Symp. Circuits Syst.*, Seoul, Korea, Aug. 7–10, 2011. Number of citations: 1
- [17] A. Peng, S. Zhuang, H. Johansson, and L. Wanhammar, "On filter symmetry polarities and lengths of *M*-channel linear-phase PR filter banks," in Proc. 3rd International Congress on Image and Signal Processing, Yantai, China, Oct. 15–17, 2010, vol. 6, pp. 3008–3012. Number of citations: 1
- [16] A. Eghbali, H. Johansson, T. Saramäki, and P. Löwenborg, "On the design of adjustable fractional delay FIR filters using digital differentiators," in *Proc. IEEE Int. Conf. Green Circuits Syst.*, Shanghai, China, June 21–23, 2010. Number of citations: 0
- [15] A. Eghbali, H. Johansson, and P. Löwenborg, "Reconfigurable nonuniform transmultiplexers based on uniform filter banks," in *Proc. IEEE Int. Symp. Circuits Syst.*, Paris, France, May 30–June 2, 2010. Number of citations: 7
- [14] Z. Sheikh and H. Johansson, "Wideband linear-phase FIR differentiators utilizing multirate and frequency-response masking techniques," in *Proc. IEEE Int. Symp. Circuits Syst.*, Taipei, Taiwan, May 24–27, 2009. Number of citations: 3
- [13] A. Eghbali, H. Johansson, and P. Löwenborg, "On the filter design for a class of multimode transmultiplexers," in *Proc. IEEE Int. Symp. Circuits Syst.*, Taipei, Taiwan, May 24–27, 2009. Number of citations: 9
- [12] M. Abbas, O. Gustafsson, and H. Johansson, "Scaling of fractional delay filters based on the Farrow structure," in *Proc. IEEE Int. Symp. Circuits Syst.*, Taipei, Taiwan, May 24–27, 2009. Number of citations: 1
- [11] H. Johansson and C. Vogel, "Efficient design and implementation of sampling rate conversion, resampling, and signal reconstruction methods," in *Proc. 8th Int. Conf. Sampling Theory and Applications*, Marseille, France, May 18–22, 2009. Number of citations: 0
- [10] H. Johansson, "On the compensation of frequency-response mismatch errors in time-interleaved ADCs," in *Proc. Int. Conf. Microelectronics*, Sharjah, UAE, Dec. 2008, *Invited Paper*. Number of citations: 0
- [9] M. Abbas, F. Qureshi, Z. Sheikh, O. Gustafsson, H. Johansson, and K. Johansson, "Comparison of multiplierless implementation of nonlinear-phase versus linear-phase FIR filters," *Asilomar Conf. Signals Syst. Comp.*, Pacific Grove, CA, Oct. 26–29, 2008. Number of citations: 0
- [8] A. Eghbali, H. Johansson, and P. Löwenborg, "A Farrow-structure based multi-mode transmultiplexer," in *Proc. IEEE Int. Symp. Circuits Syst.*, Seattle, Washington, USA, May 18–21, 2008. Number of citations: 18
- [7] S. Ahmad, N. Ahsan, A. Blad, R. Ramzan, H. Johansson, J. Dabrowski, and C. Svensson, "Feasibility of filter-less RF receiver front-end", in *Proc. GigaHertz Symposium*, Gothenburg, 5–6 Mar. 2008. Number of citations: 0
- [6] O. Gustafsson, L. S. DeBrunner, V. DeBrunner, and H. Johansson, "On the design of sparse half-band like FIR filters," Asilomar Conf. Signals Syst. Comp., Pacific Grove, CA, Nov. 4–7, 2007. Number of citations: 26
- [5] A. Eghbali, O. Gustafsson, H. Johansson, and P. Löwenborg, "On the complexity of multiplierless direct and polyphase FIR filter structures," in *Proc. Int. Symp. Image, Signal Processing, Analysis*, Istanbul, Turkey, Sept. 27–29, 2007. Number of citations: 4
- [4] A. Eghbali, H. Johansson, and P. Löwenborg, "Flexible frequency-band reallocation MIMO networks for real signals," in *Proc. Int. Symp. Image, Signal Processing, Analysis*, Istanbul, Turkey, Sept. 27-29, 2007. Number of citations: 8
- [3] A. Eghbali, H. Johansson, and P. Löwenborg, "An arbitrary bandwidth transmultiplexer and its application to flexible frequency-band reallocation networks," in *Proc. European Conf. Circuit Theory Design*, Seville, Spain, Aug. 26-30, 2007. Number of citations: 11
- [2] M. Olsson, H. Johansson, and P. Löwenborg, "Simultaneous estimation of gain, delay, and offset utilizing the Farrow structure," in *Proc. European Conf. Circuit Theory Design*, Seville, Spain, Aug. 26-30, 2007. Number of citations: 3

- [1] O. Gustafsson and H. Johansson, "Complexity comparison of linear-phase *M*th-band and general FIR filters," in *Proc. IEEE Symp. Circuits Syst.*, New Orleans, USA, May 27–30, 2007. Number of citations: 1

## BOOK CHAPTERS, 2007-2015

- [3] H. Johansson, "Sampling and quantization," in *Academic Press Library in Signal Processing: Signal Processing Theory and Machine Learning*, vol. 1, pp. 169–244, 2013. Number of citations: 1
- [2] H. Johansson and O. Gustafsson, "Two-rate based structures for computationally efficient wide-band FIR systems," in *Digital Filters and Signal Processing*, Eds. F. P. Garcia Marquez and N. Zaman, ch. 8, pp. 189–212, 2013. Number of citations: 0
- [1] A. Eghbali and H. Johansson, "Reconfigurable multirate systems in cognitive radios," in *Foundation of Cognitive Radio Systems*, S. Cheng, Ed., ISBN 978-953-51-0268-7, 2012, 22 pages. Number of citations: 0

## BOOKS, 2007-2015

- [2] L. Wanhammar and H. Johansson, *Digital Filters Using Matlab*, 650 pages, Linköping University, 2013. Number of citations: 5 (older version from 2002 cited 75 times)
- [1] H. Johansson, Discrete-Time Systems, 367 pages, Linköping University, 2007. Number of citations: 1

## PATENTS, 2007-2015

- [6] H. Johansson, "Methods and apparatuses for estimation and compensation of nonlinearity errors", United States Patent 8,825,415, Sept. 2, 2014. Number of citations: 0
- [5] H. Johansson, "Methods and apparatuses for compensation of I/Q imbalance", United States Patent 8,588,336, Nov. 19, 2013. Number of citations: 0
- [4] H. Johansson and P. Löwenborg, "Method and a system for estimating errors introduced in a time-interleaved analog-to-digital converter system," United States Patent 8,307,248, Nov. 6, 2012. Number of citations: 3
- [3] H. Johansson and P. Löwenborg, "Compensation of mismatch errors in a time-interleaved analog-to-digital converter", United States Patent 8,009,070, Aug. 30, 2011. Number of citations: 6
- [2] H. Johansson, "Compensation of mismatch errors in a time-interleaved analog-to-digital converter", United States Patent 7,978,104, July 12, 2011. Number of citations: 3
- [1] H. Johansson and P. Löwenborg, "Estimation of timing errors in a time-interleaved analog-to-digital converter system," United States Patent 7,741,982, June 22, 2010. Number of citations: 3

# CV

Name: Oscar Gustafsson Birthdate: 19730330 Gender: Male Doctoral degree: 2003-09-26 Academic title: Docent Employer: Linköpings universitet

# **Research education**

| Dissertation title (swe)<br>Bidrag till digitala filter med låg komplexitet |                                      |                    |
|-----------------------------------------------------------------------------|--------------------------------------|--------------------|
| Dissertation title (en)<br>Contribution to Low-Complexity Digital Filters   |                                      |                    |
| Organisation                                                                | Unit                                 | Supervisor         |
| Linköpings universitet, Sweden<br>Sweden - Higher education Institutes      | Institutionen för systemteknik (ISY) | Lars Wanhammar     |
| Subject doctors degree                                                      | ISSN/ISBN-number                     | Date doctoral exam |
| 20299. Annan elektroteknik och elektronik                                   | 91-7373-702-X                        | 2003-09-26         |

| Name: Håkan Johansson | Doctoral degree: 1998-05-18     |  |
|----------------------|----------------------------------|--|
| Birthdate: 19690701  | Academic title: Professor        |  |
| Gender: Male         | Employer: Linköpings universitet |  |

# **Research education**

| Dissertation title (swe)<br>Syntes och Realisering av Höghastighet Rekursiva Digitala Filter |                                      |                    |
|----------------------------------------------------------------------------------------------|--------------------------------------|--------------------|
| Dissertation title (en)<br>Synthesis and Realization of High-Speed Recursive Digital Filters |                                      |                    |
| Organisation                                                                                 | Unit                                 | Supervisor         |
| Linköpings universitet, Sweden<br>Sweden - Higher education Institutes                       | Institutionen för systemteknik (ISY) | Lars Wanhammar     |
| Subject doctors degree                                                                       | ISSN/ISBN-number                     | Date doctoral exam |
| 20299. Annan elektroteknik och elektronik                                                    |                                      | 1998-05-18         |

# Publications

Name: Oscar Gustafsson Birthdate: 19730330 Gender: Male Doctoral degree: 2003-09-26 Academic title: Docent Employer: Linköpings universitet

Gustafsson, Oscar has not added any publications to the application.

# Publications

Name: Håkan Johansson Birthdate: 19690701 Gender: Male Doctoral degree: 1998-05-18 Academic title: Professor Employer: Linköpings universitet

Johansson, Håkan has not added any publications to the application.

## Register

## Terms and conditions

The application must be signed by the applicant as well as the authorised representative of the administrating organisation. The representative is normally the department head of the institution where the research is to be conducted, but may in some instances be e.g. the vice-chancellor. This is specified in the call for proposals.

The signature from the applicant confirms that:

- the information in the application is correct and according to the instructions from the Swedish Research Council
- any additional professional activities or commercial ties have been reported to the administrating organisation, and that no conflicts have arisen that would conflict with good research practice
- the necessary permits and approvals are in place at the start of the project, e.g. regarding ethical review.

The signature from the administrating organisation confirms that:

- the research, employment and equipment indicated will be accommodated in the institution during the time, and to the extent, described in the application
- the institution approves the cost-estimate in the application
- the research is conducted according to Swedish legislation.

The above-mentioned points must have been discussed between the parties before the representative of the administrating organisation approves and signs the application.

Project outlines are not signed by the administrating organisation. The administrating organisation only signs the application if the project outline is accepted for step two.

Applications with an organisation as applicant are automatically signed when the application is registered.