Microelectronics Journal 42 (2011) 33-42


# Power efficient asynchronous multiplexer for X-ray sensors in medical imaging analog front-end electronics

Rafał Długosz<sup>a,b,\*</sup>, Pierre-André Farine<sup>a</sup>, Kris Iniewski<sup>c</sup>

<sup>a</sup> Swiss Federal Institute of Technology in Lausanne, Institute of Microtechnology, Rue A.-L. Breguet 2, CH-2000 Neuchâtel, Switzerland

<sup>b</sup> University of Technology and Life Sciences, Faculty of Telecommunication and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland

<sup>c</sup> CMOS Emerging Technologies Inc., 2865 Stanley Pl., Coquitlam, BC V3B 7L7, Canada

#### ARTICLE INFO

Article history: Received 31 January 2010; received in revised form 29 August 2010; accepted 1 September 2010; available online 18 September 2010

Keywords: Medical imaging; Multiplexers; Asynchronous circuits; Analog front end; X-ray detection

#### ABSTRACT

This paper presents a new power-efficient asynchronous multiplexer (MUX) for application in analog front-end electronics (AFE) used in X-ray medical imaging systems. In contrast to typical synchronous MUXes, which have to be controlled by a clock, this circuit features a simple structure, as no clock is required. The circuit dissipates power only while detecting the active signals and then automatically returns to the power-down mode. Medical imaging systems usually consist of several dozen to several hundred channels that operate asynchronously. The proposed MUX enables an unambiguous choice of the active channel. If two or more channels become active at the same time, the MUX serializes the readout of data from the particular channels. This characteristic leads to 100% effectiveness in data processing with no loss of impulses. The proposed MUX, together with an experimental readout ASIC, has been implemented in the CMOS 0.18 $\mu$m process and occupies an area of 1100 $\mu$m<sup>2</sup> per channel. It works properly over a wide supply voltage range, between 0.8 and 1.8 V. The energy consumed during the detection of one active channel is below 1 pJ, while the detection time is about 1 ns.

© 2010 Elsevier Ltd. All rights reserved.

#### 1. Introduction

Electronic signal detection and processing of X-ray images are gaining widespread acceptance due to their inherent benefits of data storage and transmission in a digital format as opposed to conventional X-ray films [1]. At present, most nuclear medical imaging devices use a scintillator–photomultiplier combination to detect X-rays or gamma rays. The scintillator absorbs X-ray or gamma photons that are emitted by radionuclides introduced to the patient's body with pharmaceuticals, and re-emits the energy as visible light. This light is absorbed by a photocathode of the photomultiplier tube (PMT) and re-emitted as a burst of electrons. Further data processing is performed using external analog front-end electronics (AFE).

Due to the multi-step detection process that involves visible light, PMT devices suffer from poor imaging resolution. Recently, these problems were addressed by the fabrication of solid-state detectors that operate at room temperature and convert X-ray photons directly to electrical signals [1]. Charge carriers of these detectors are sensed by an array of pixel electrodes (sensors) and directly processed in the associated AFE that conditions the analog signals received from the array of sensors and then performs an analog-to-digital (A/D) conversion. The ability to use solid-state detectors has enabled a great improvement in the spatial resolution of X-ray based medical imaging techniques.

A typical AFE used in medical imaging applications is realized as a multi-channel application-specific integrated circuit (ASIC); its block diagram is shown in Fig. 1. Each channel in this system consists of a sensor (S), a charge amplifier (G), a pulse shaping filter (PS) and a peak detector (PD) [2]. The signal processing scheme in each channel starts with the detection of incident radiation by a sensor that generates, in response, an equivalent amount of charge. The problem faced at this stage is that the charge is very small: for 1 keV X-rays it is on the level of several dozen aC. The other problem is the random distribution of this charge over time. The amplifier is therefore used to amplify this charge and to integrate it in an embedded capacitor. If this integration is fast enough, the PS block receives a voltage that approximates the Heaviside step function. The PS block is realized as a continuous-time band-pass filter, which converts this step function into a pulse with a given peaking time and an amplitude that is linearly proportional to the value of the step voltage. The subsequent PD precisely determines the peaking time, which is necessary to capture the peak value, and finally sets the FLAG signal, which is a request directed to the output MUX to read data. Since pulses in particular channels appear asynchronously, special multiplexing that serializes the tasks is required. After reading the data the MUX block resets the corresponding FLAG signal and the given channel is ready to detect the next event.

**Fig. 1.** A typical front-end ASIC for multi-element detectors [4].

<sup>\*</sup> Corresponding author at: University of Technology and Life Sciences, Faculty of Telecommunication and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland. Tel.: +48 668 160 217.

E-mail addresses: rafal.dlugosz@gmail.com (R. Długosz), pierre-andre.farine@epfl.ch (P.-A. Farine), iniewski@ieee.org (K. Iniewski).

0026-2692/\$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.mejo.2010.09.006

In the literature one can find several AFE readout systems that differ in the structure of particular blocks. The most advanced solutions of this type, with 32 asynchronously operating channels, have been described in [3,4,22]. In the chip described in [3,4] two 16-channel preamplifier-shaper ASICs produce unipolar pulses with a width of 1.2 $\mu$s, which are provided to a self-switched MUX (SSM). The SSM chip consists of a comparator bank, a 32:1 switch matrix and arbitration logic. The SSM detects above-threshold inputs and routes them to the output FIFO (first in, first out) structure. It also presents the 5-bit address of the selected channel at the output. In the circuit described in [3,4] no collision-prevention mechanism has been applied, which makes this solution generally sufficient in cases where impulses occur rarely. If two or more impulses arrive close together in time (simultaneous events), the impulses overlap and a collision occurs in the system, since several channels try to connect to the output at the same time. As a result, a portion of the detected impulses may be lost. This problem might be important in some applications and therefore the authors of [3,4] developed the Simultaneous Events Catcher (SEC), which prevents collisions between simultaneous events. The solution described in [22] is the first known circuit addressing this problem.

Another AFE realized in the CMOS 0.18 µm technology has been reported in [5]. This system is composed of 64 analog channels and a synchronous analog MUX controlled by a multi-phase clock. This clock cyclically checks all the channels in a loop. This approach eliminates the collision problem from the system, but the MUX has to be active the whole time. It also limits the maximum data rate of a single channel to $f_S/M$ per second, where $f_S$ is the sampling frequency of the MUX and *M* is the number of channels. Moreover, the clock makes the overall circuit much more complex. From both the power dissipation and the chip area points of view this approach is not optimal. The MUX occupies an area of more than 2.5 mm<sup>2</sup>, although 66% of this area is occupied by the differential amplifiers used to compensate the noise.

The AFE in medical imaging applications can also be realized without an explicit MUX block. In the 32-channel system reported in [6], instead of using the MUX followed by a high data rate analog-to-digital converter (ADC), as shown in Fig. 1, each channel has its own converter, while the outputs of the channels are shorted together. A disadvantage of this approach is the large number of ADC blocks, which significantly enlarges the chip area. This system, realized in the CMOS 0.25 $\mu$m technology, occupies 42 mm<sup>2</sup>, with about 20% of this area occupied by the ADCs, which are in this case 10-bit charge-redistribution successive approximation (SAR) converters. In this type of converter the capacitors occupy a large chip area [7,8]. On the other hand, the advantage of this approach is that it eliminates the bottleneck caused by an insufficiently fast single ADC at the output of typical AFE systems. In systems with a large number of channels, in which the number of events is also large, this is an important feature. In such systems a possible solution could be a mixed one, with an intermediate number (*K*) of ADCs and *K* MUXes, each with *M*/*K* inputs. Such an approach is to be explored in the future.

In this paper we focus on the last element in the signal processing chain, i.e. the MUX circuit, which in this approach has been designed as a fully asynchronous circuit. The proposed circuit enables detection of the active channel in less than 1 ns. The circuit has a built-in de-randomization mechanism that eliminates collisions in the system even if all channels become active at the same time. Since the proposed circuit operates in an asynchronous fashion, it can be viewed as an alternative to the solution reported earlier in [22]. In [22] the block preventing the collisions is placed at the input of the system. This block directs each input signal to one of the eight channels containing peak detectors and time-to-amplitude converters (PD/TAC) that is available at the moment, i.e. does not process any signal. As a result, the system is able to handle eight simultaneous events. In our circuit the collision-preventing block is included in the MUX, which is the last block in the system. In this case particular channels operate independently. The proposed system is able to catch a number of impulses equal to the number of channels, as each channel contains its own PS and PD blocks. The bottleneck in this system is the output ADC. Comparing the two solutions, the proposed circuit is able to catch more simultaneous events, but on the other hand the solution described in [22] requires a smaller number of PD blocks (eight in this case), which are shared between all channels.

The paper is organized as follows. An overview of typical MUX architectures is given in the next section. Section 3 is devoted to the proposed multiplexer realized in the CMOS technology. A verification of the conscience mechanism by means of detailed post-layout circuit-level simulations is presented in Section 4. Finally, the conclusions are covered in Section 5.

#### 2. Multiplexing circuits—an overview

Multiplexers are widely used in various fields of industry, mostly in telecommunication, but also in medical applications, nuclear physics and others. A variety of multiplexing circuits, both digital and analog, have been reported in the literature [9–19]. So far, in most cases synchronous solutions have been used, which are controlled by an arbitrary clock circuit that determines the multiplexing data sequence. On the other hand, asynchronous solutions are useful in those applications where data occur randomly at the inputs, e.g. in nuclear medicine.

Since the application of the proposed MUX significantly differs from that of the synchronous circuits, we do not focus in detail on particular state-of-the-art solutions, presenting only the information that is necessary to place our circuit in a proper perspective.

Several typical MUX architectures are usually distinguished [11]. The most popular of them are based on the shift register concept [9], the multi-phase approach, also known as single-stage [5,10,19], or the binary-tree (BT) structure [11–18]. The first two are straightforward structures that are able to handle an arbitrary number of inputs, which is one of their advantages. Unfortunately, in shift register-type MUXes all circuit components operate at the highest rate, which usually is the source of high power dissipation [11]. The multi-phase solutions, on the other hand, feature a simple structure and require only a low-frequency multi-phase clock, which is an advantage, but they suffer from a large capacitive load that limits the maximum sampling frequency. The binary-tree MUXes to some degree overcome the limitations of the first two architectures. The capacitive load is in this approach much smaller than in the case of the multi-phase structures, so the circuit can operate at higher speed. In the binary-tree MUXes particular layers in the tree operate at different frequencies. Since only some blocks operate at the highest speed, the power dissipation is reduced. A disadvantage of this solution is that it requires a multi-rate clock, in which the distribution of clock signals of different frequencies has to be precisely controlled. Despite its disadvantages, this approach has recently been the most frequently used [11–18].

**Fig. 2.** The proposed binary-tree asynchronous multiplexer: (a) the general structure, (b) a single active channel detection block (ACDB) and (c) the circuit that determines the address of the active channel.

**Fig. 3.** A prototype AFE chip implemented in the TSMC 0.18 $\mu$m technology. The MUX block composed of seven active channel detection blocks (ACDB) occupies an area of 300 × 30 $\mu$m<sup>2</sup>.

Considering analog data processing, e.g. in medical imaging applications, the most reasonable MUX structure that can be applied is the multi-phase one. In the other two cases input data need to be copied several times in the structure, e.g. between layers in the tree. In the case of analog signals the copying process would be a source of large errors. In the binary-tree structures, for example, data are copied $\log_2 M$ times, while in the shift register MUX even *M* times. This is the reason for using only multi-phase structures in medical imaging systems [3,5]. While in all these cases a similar structure is used in the analog signal path, with only some modifications, an essential difference exists in the structure of the channel selection circuit, which is always a digital block. In the AFE described in [5] the loop selection sequence is imposed by the arbitrary clock. Since this is a typical multi-phase scheme, the MUX has to be oversampled at least *M* times in comparison with the data rate of particular channels to ensure a proper readout of all data. We have proposed quite a different approach, which can be referred to as the asynchronous one. The analog signal processing path is similar to those used in [3–5], while the channel selection circuit is based on the binary-tree concept. In this approach, detecting any active channel requires at most $\log_2 M$ switching operations (steps) in the tree, but since in some cases no switching occurs, the average number of switching operations equals $0.5 \log_2 M$. These steps are started automatically with a delay of only two logic gates and without the arbitrary clock, which allows for achieving very small detection times and consequently relatively high data rates.

**Fig. 4.** Transistor-level simulations for the worst-case scenario, in which all ACDBs between a given input and the output are turned over, for $V_{DD}$=1.8 V: (a) the flag signals, F, at particular layers in the tree and a corresponding address, S, (b) the clock enable signals, EN, in the ACDBs and (c) the total current consumption.

**Fig. 5.** Similar results as shown in Fig. 4 for $V_{DD} = 0.8$ V.
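The switching-count estimate for the binary-tree selection circuit (at most $\log_2 M$ steps, $0.5 \log_2 M$ on average) can be checked with a short enumeration. The sketch below is only a counting argument, not a model of the circuit: it assumes that each block on the path between the output and the active channel is equally likely to point towards or away from that channel, an assumption that is not stated in the text. The function name `average_toggles` is hypothetical.

```python
from itertools import product
from math import log2


def average_toggles(num_channels: int) -> float:
    """Average number of block toggles needed to reach one active channel.

    Enumerates every combination of pointer states along the path from the
    output to the channel; a block toggles only if it points the wrong way.
    Assumes (for illustration) that all pointer states are equally likely.
    """
    layers = int(log2(num_channels))
    states = list(product((0, 1), repeat=layers))   # 1 = wrong direction
    return sum(sum(state) for state in states) / len(states)


# Worst case: every block on the path has to toggle, i.e. log2(M) steps.
worst_case = {m: int(log2(m)) for m in (2, 4, 8, 256)}
# Average over all equally likely pointer states: 0.5 * log2(M) steps.
average = {m: average_toggles(m) for m in (2, 4, 8, 256)}
```

For eight channels this gives three steps in the worst case and 1.5 steps on average, in line with the two expressions quoted above.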

#### 3. The proposed asynchronous multiplexer

The proposed MUX has been optimized for application in AFE that operates asynchronously. The role of the proposed circuitry is to detect the event of data appearing in one of the *M* input channels, and to establish a connection between the given active input and the MUX output in order to read out data. Upon detection, the MUX copies the peak of the analog impulse stored in the sample-and-hold (S&H) cell to the output stage, and determines the address of a given active channel. The general MUX architecture used in this application is fairly typical (a tree-type structure), but mechanisms used to detect the active channels proposed here are novel. The binary-tree structure ensures an unambiguous selection of only one path between a given MUX input and the output.

One of the most important innovations is that the proposed MUX does not require an external clock and operates fully asynchronously. As a result, when all channels are inactive the circuit is in the power-down mode, and it is automatically activated when new data occur at any input. Since the circuit is composed of CMOS elements, in the standby mode it consumes a negligible power of only several nW. One of the features introduced in our circuit, which typically is not used in synchronous MUXes, is the mechanism that allows detecting the address of the channel that became active. This information is relevant in medical diagnostics applications, as it provides the address of the pixel that has received an X-ray photon.

Another of the introduced innovations is the ability to solve the problem of collisions between events. Even if two or more channels become active at the same time, the circuit properly serializes access to the MUX output so that no data are lost. As long as a given active channel is being read out, data in the other channels are held and these channels are not allowed to detect new events. After the data are read out, the given channel is reset (its flag becomes logical '0'), while the MUX automatically turns to another active channel.

The general structure of the proposed MUX is shown in Fig. 2(a). It is composed of $M-1$ active channel detection blocks (ACDB), shown in detail in Fig. 2(b). The MUX can be in several states. In a particular ACDB the input signals are flags (F) received either directly from the corresponding channels, in the case of the bottom layer in the tree, or from the preceding layers. If all flags are logical '0', which means that all channels are inactive, the overall structure is in the standby mode. Each ACDB contains an asynchronously started 2-phase clock that is built around a D flip-flop (DFF). The operation principle of this concept is explained for an example case of *M*=4, i.e. for two layers in the tree, with two ACDBs at the bottom layer, namely ACDB11 and ACDB12, and one (ACDB21) at the top. The example sequence is as follows:

1. For all flags F11–F14 being '0', signals X11–X14 are also '0' and the corresponding signals Y are logical high. Similarly, flags F21 and F22 are zero, signals X21 and X22 are zero, and the clock enable EN in each ACDB is zero. In this case the tree is turned off.
2. As a signal appears in channel 1, its corresponding flag F11 changes to logical high. As a result, the F21 flag changes to logical high as well. At this point there are two options:
   - 2.1 clk1 was logical low (clk2 was logical high) at the moment that the signal appeared in channel 1 and the corresponding F11 flag was set. As a result, the outputs of A1 and A2 in ACDB11 are still zero, while Y is logical high, and since F21 is one, EN becomes logical high, starting the clock for the ACDB11 pair. After one clock cycle clk1 becomes logical high and the X11 signal becomes logical high, which stops the clock.
   - 2.2 clk1 was one at the time the signal appeared in channel 1 and the corresponding F11 flag was set. In this case the A1 output is at one, which causes Y21 to change value to zero, and the enable signal EN in ACDB11 remains zero. In consequence, the clock is not started, as desired.
3. Almost in parallel with the operation described in point 2 above, the ACDB21 circuitry at the second level is activated, as the F21 flag becomes logical high after only one OR delay. Subsequently, that circuit operates in exactly the same fashion as the one at the lower level. The asynchronous operation plays the main role here, as even for a larger number of layers in the tree the proposed MUX is able to establish the connection between the given input and the output after a time equal to the delay of only several OR gates. Once the connection is established, the MUX is again in the standby mode.
4. It is important to point out that the clock in any ACDB is automatically started only in a situation where one channel is active and the clock is not pointing at this channel. In the case when both flags F11 and F12 are active at the same time and a connection between one of these inputs and the MUX output is established, the second flag must wait until the first flag is reset. This is one of the main advantages of the proposed solution, as even in the worst-case scenario, when all flags are logical high at the same time, access to the output is always limited to only one input, while the other channels must wait. This prevents collisions at the MUX output, as mentioned above.

The access to the MUX output is controlled by signals  $S_x$  that control the switches in the analog paths (see Fig. 2(a)). These signals depend on signals X from particular layers in the tree, as shown in Fig. 2(c). This concept features a very simple structure. Each ACDB block consists of only one DFF and several logic gates, thus occupying a very small chip area.
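The switching rule of a single ACDB described in the sequence above can be summarized in a small behavioral sketch. This is an abstraction written for illustration, not the gate-level circuit of Fig. 2(b): the class `ACDB`, its `pointer` attribute (standing in for the state of the 2-phase clock) and the method names are all hypothetical, and timing is ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class ACDB:
    """Behavioral stand-in for one active channel detection block.

    `pointer` plays the role of the two-phase clock state: 0 means that
    input A is selected, 1 means that input B is selected (assumed encoding).
    """
    pointer: int = 0

    def update(self, flag_a: bool, flag_b: bool) -> bool:
        """Apply the switching rule once; return True if the block toggled."""
        flags = (flag_a, flag_b)
        # The clock is started only when the selected input is idle while
        # the other input is requesting service.
        if not flags[self.pointer] and flags[1 - self.pointer]:
            self.pointer = 1 - self.pointer
            return True
        return False

    def out_flag(self, flag_a: bool, flag_b: bool) -> bool:
        """Flag passed on to the next layer: OR of both input flags."""
        return flag_a or flag_b


block = ACDB(pointer=1)                          # currently pointing at input B
toggled_single = block.update(True, False)       # only A active: one toggle
pointer_after_single = block.pointer

toggled_both = block.update(True, True)          # both active: B has to wait
pointer_after_both = block.pointer

toggled_after_reset = block.update(False, True)  # A read out and reset: switch to B
pointer_after_reset = block.pointer

toggled_idle = block.update(False, False)        # no flags: standby, no toggle
```

The second call illustrates point 4 of the sequence: with both flags set, the block keeps its current selection, so only one input has access to the output at a time.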



**Fig. 6.** Performance characteristics of the proposed asynchronous MUX: (a) energy consumed during detection of a single active channel vs. sampling frequency,  $f_{s}$ , for selected supply voltages, (b) Performance Index (PI) defined as  $f_s$  over energy per event vs. achievable data rate for different values of  $V_{DD}$ .

#### 4. Implementation of the proposed multiplexer in the CMOS technology

The proposed MUX has been applied in a prototype AFE with eight analog channels realized in the CMOS 0.18 $\mu$m process, as shown in Fig. 3. Other system components, i.e. the PS filter and the PD with an S&H memory cell, have been described by the authors in [20,21]. The proposed MUX occupies an area of 0.009 mm<sup>2</sup>, i.e. 1100 $\mu$m<sup>2</sup> per channel.

To illustrate the MUX performance, selected transistor-level simulation results are shown in Figs. 4, 5 and 7. The MUX can operate over a wide range of supply voltages. Figs. 4 and 5 show that the supply voltage influences only the circuit speed and does not affect its functionality. Comparing the two cases, one can see that the power dissipation at 1.8 V is higher by two orders of magnitude than at 0.8 V, but since the circuit is ca. 10 times faster at 1.8 V, the energy consumption per single detection operation increases only moderately.
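As a rough consistency check of this comparison, assume that the energy per detection is simply the average power during detection multiplied by the detection time (an idealization, not a figure taken from the simulations). Taking the two stated ratios at face value, with the subscripts denoting the supply voltage in volts:

$$\frac{E_{1.8}}{E_{0.8}} = \frac{P_{1.8}}{P_{0.8}} \cdot \frac{t_{1.8}}{t_{0.8}} \approx 10^{2} \cdot 10^{-1} = 10$$

Under this assumption the energy per detection grows by about one order of magnitude, i.e. considerably less than the power itself.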

A comparative study for different values of the supply voltage is shown in Fig. 6. Defining the performance index (PI) as the ratio between the data rate and the energy consumed by the MUX during detection of a single event, it can be seen that the optimum is obtained for $V_{DD}$ = 1 V. For higher supply voltages the data rate increases, but at the expense of energy consumption, which increases much faster. On the other hand, reducing the supply voltage decreases the energy consumption per single event, but the data rate decreases faster, which is also not optimal.

The results shown in Figs. 4 and 5 are for only one channel being active at any time. This situation allows us to determine the maximum delay introduced by the MUX, which is measured as the period between setting the flag at the output of a given channel and the corresponding S signal (the address). The maximum data rate, which exceeds 1 GHz, is obtained for $V_{DD}$=1.8 V. In this case the energy consumed per single event is the largest but is still below 1 pJ, which is much less than in the synchronous MUXes reported in the literature.

The results in Fig. 7 are shown for the worst-case scenario, in which all channels are active at the same time. This case demonstrates how the proposed de-randomization mechanism operates. Fig. 7 illustrates two cycles, each starting with the occurrence of all eight flags at the MUX inputs. In the first cycle the 3rd channel is read out first. After the flag F13 is reset, the ACDB12 block immediately switches to F14. As the flag F22 is still '1' at this time, the clock in ACDB21 remains inactive. After reading out the 4th channel, only ACDB21 turns over and the 1st channel is read out next, followed by the 2nd channel. After that ACDB31 turns over to the group F15–F18 and the process continues until all channels are read out. Note that after the first cycle all ACDBs remember their last settings, which is the reason for the opposite reading sequence in the next cycle. From the power dissipation point of view this case is optimal, as only one ACDB turns over for each input data item.

**Fig. 7.** Simulation results for all channels being active at the same time: (a) the input flag signals, (b) the EN signals in particular ACDBs, (c) the corresponding address signals and (d) total current consumption.
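The readout order described for the all-channels-active case can be reproduced with a purely behavioral model of the binary tree. The sketch below is an illustration under stated assumptions, not the authors' circuit: the function `read_out_all` is hypothetical, timing is ignored, and the initial pointer states are chosen so that channel 3 is served first, as in the text. The initial states in the F15–F18 group are not given in the text and are assumed here.

```python
def read_out_all(flags, pointers):
    """Serve every set flag once; return (order, toggles_per_readout).

    flags    -- list of M booleans (M a power of two), one per channel
    pointers -- list of layers from bottom to top; layer k holds one 0/1
                pointer per ACDB and is updated in place
    """
    flags = list(flags)
    order, toggles = [], []
    while any(flags):
        # Propagate flags upwards: each ACDB outputs the OR of its inputs.
        levels = [flags]
        while len(levels[-1]) > 1:
            prev = levels[-1]
            levels.append([prev[i] or prev[i + 1]
                           for i in range(0, len(prev), 2)])
        # Walk from the top ACDB down to a channel, toggling where needed.
        index, count = 0, 0
        for layer in range(len(pointers) - 1, -1, -1):
            inputs = levels[layer][2 * index: 2 * index + 2]
            if not inputs[pointers[layer][index]]:
                pointers[layer][index] ^= 1       # this ACDB turns over
                count += 1
            index = 2 * index + pointers[layer][index]
        order.append(index + 1)                   # channels numbered from 1
        toggles.append(count)
        flags[index] = False                      # flag reset after readout
    return order, toggles


# Assumed initial state: bottom layer ACDB11..ACDB14, then ACDB21/ACDB22,
# then ACDB31; ACDB21 initially points at the pair F13/F14.
pointers = [[0, 0, 0, 0], [1, 0], [0]]
first_order, first_toggles = read_out_all([True] * 8, pointers)
second_order, second_toggles = read_out_all([True] * 8, pointers)
```

With these assumptions the first cycle starts with channels 3, 4, 1 and 2, the second cycle runs in the reverse order because the pointers keep their last settings, and no readout requires more than one block to turn over.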

The proposed MUX can also potentially be used as a synchronous circuit, as shown in Fig. 8(a) for an example case of 8 inputs. In this configuration, additional 2-input AND gates need to be used at the MUX inputs. The clock signals (clk1,..., clk8) are applied to one input of the AND gates, and the transmitted data (In1,..., In8) to the other. In this case flag F41 becomes the MUX output signal. In this configuration neither the X nor the S signals are used, since there is no need to determine the addresses of the channels. Although the data rate of this MUX (about 1.5 GHz) is lower than that of state-of-the-art solutions, the energy per single bit is below 0.3 pJ. This is two orders of magnitude less than in the circuit described in [17], also realized in the 0.18 µm CMOS process. Simulation results for the synchronous configuration are presented in Fig. 8(b). The input signals are sampled at the rate of 0.19 GHz, while the output at 1.51 GHz. One of the advantages of this solution is that a multi-rate clock, which is typically used at particular layers in synchronous binary-tree MUXes, is not required in this case. This significantly simplifies the circuit structure and reduces power dissipation.

Since the proposed MUX is going to be used in a commercial AFE, careful tests on process, voltage and temperature (PVT) variation have been undertaken. The results for $V_{DD}$=1.8 V as well as for 0.8 V are shown in Fig. 9 for temperatures ranging between –40 and 100 °C, and for several representative transistor models, namely the typical-typical (TT), fast-fast (FF) and slow-slow (SS) models. As can be seen, these parameters influence only the achievable data rate. The presented results show that the optimal supply voltage is 1 V, as in this case the energy usage is the best (see Fig. 6) and the data rate is stable over a wide range of environmental temperatures.

The results presented above are shown for eight channels, i.e. for three layers in the tree. To illustrate the circuit performance in a wider perspective, Fig. 10 presents the maximum achievable data rate as a function of the number of inputs. The results shown for two, four and eight inputs have been obtained from transistor-level simulations, while the results for larger numbers of inputs have been calculated. The BT solution provides an important advantage here. Even for a large number of channels the number of layers in the tree is small, e.g. 8 for 256 channels, so the data rate is not significantly limited in that case. The maximum data rate can be calculated as follows:

$$f_{S,\max} = \frac{f_{S,\mathrm{1ACDB}}}{\log_2 M} \quad [\mathrm{Hz}] \qquad (1)$$

In the formula above, $f_{S,\mathrm{1ACDB}}$ is the maximum data rate of a single ACDB, while *M* is the number of MUX inputs. Note that doubling the number of inputs always adds only one layer to the tree, which makes the data rate decrease rather moderately with the number of inputs. This is an important advantage in comparison with other MUXes used in the AFE chips described above.

**Fig. 8.** The proposed MUX used in the synchronous mode for a data rate of 0.19 GHz at the inputs and 1.52 GHz at the output: (a) a test structure controlled by an 8-phase clock and (b) the input and output signals. The results are presented for $V_{DD}$=1.8 V and a room temperature of 20 °C.

**Fig. 9.** Corner analysis of the circuit performance: maximum data rate, $f_S$, versus the environmental temperature for the slow, typical and fast transistor models, for an example case of eight inputs and three layers in the tree: (a) $V_{DD}$=1.8 V and (b) $V_{DD}$=0.8 V. For $V_{DD}$ of 1.0 V the particular waveforms are approximately flat at the levels of 0.267, 0.35 and 0.53 GHz for the SS, TT and FF models, respectively.
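The scaling expressed by Eq. (1) can be tabulated with a few lines of code. The sketch below only evaluates the formula; the function names are hypothetical and the single-ACDB rate of 3 GHz is an assumed round number chosen for illustration, not a value reported in the paper.

```python
def tree_layers(num_inputs: int) -> int:
    """Number of ACDB layers in a binary tree with `num_inputs` inputs."""
    if num_inputs < 2 or num_inputs & (num_inputs - 1):
        raise ValueError("num_inputs must be a power of two, at least 2")
    return num_inputs.bit_length() - 1


def max_data_rate(f_single_acdb_hz: float, num_inputs: int) -> float:
    """Eq. (1): single-ACDB data rate divided by the number of layers."""
    return f_single_acdb_hz / tree_layers(num_inputs)


F_SINGLE_ACDB_HZ = 3.0e9   # assumed value, for illustration only
rates = {m: max_data_rate(F_SINGLE_ACDB_HZ, m) for m in (2, 4, 8, 16, 256)}
```

Going from 8 to 256 inputs multiplies the number of channels by 32 but only raises the number of layers from 3 to 8, so under this formula the maximum data rate drops by a factor of 8/3.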

Fig. 10 also illustrates the results for the synchronous mode. In this case the addresses of the channels do not need to be determined, which makes the circuit significantly faster and more power efficient than in the asynchronous mode.

In this paper we have presented only the simulation results of the MUX. The main reason is that this circuit has been used as one of the components of the AFE chip, in which the measurement of the MUX as a separate block is not possible. Measurement results of this particular chip do not provide exact information on the achievable data rates, since particular analog channels in the system, operating at much lower sampling frequencies, are the bottleneck here. In our opinion the simulation results in this particular case can be viewed as a good estimate of the MUX performance, as this block, being a simple feed-forward structure, is composed of only digital elements. This has additionally been confirmed by means of detailed corner analysis, as presented above.

#### 4.1. A comparative study with other multiplexing circuits

**Fig. 10.** Maximum data rate vs. number of inputs for 1.8 and 0.8 V supply voltages for the MUX working either in the synchronous or the asynchronous mode. The results are presented for the TT transistor model and a room temperature of 20 °C.

Performance comparison between reported MUX structures is provided in Table 1. Since the most commonly used solutions are the synchronous ones, used for multiplexing digital data, for our solution we have reported the active channel selection time, as this is the comparable parameter. It should be remembered that if the MUX is used in the AFE, a significantly longer time is required to read out and then to reset the channel. For a meaningful comparison a Figure-of-Merit (FOM) has been defined as the ratio of a normalized sampling frequency to the dissipated power per single input:

$$\mathrm{FOM} = \frac{f_{\rm NORM}}{P/M} = \frac{f_{\rm S}/\log_2 M}{P/M} \quad [\mathrm{GHz/mW}]$$
(2)

To make the reported sampling frequencies comparable we have normalized them to a single layer only (in the case of BT solutions), since each layer usually introduces an approximately equal delay.
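The FOM can be checked numerically with a short script. The sketch below reads Eq. (2) as $(f_{\rm S}/\log_2 M)/(P/M)$, an interpretation that reproduces the FOM column of Table 1. The function and variable names are illustrative and do not come from the original design flow.

```python
from math import log2


def fom(m_inputs: int, power_mw: float, fs_ghz: float) -> float:
    """FOM read from Eq. (2): (f_S / log2 M) / (P / M), in GHz/mW."""
    f_norm = fs_ghz / log2(m_inputs)   # data rate normalized by the number of tree layers
    p_per_input = power_mw / m_inputs  # dissipated power per single input
    return f_norm / p_per_input


# (M, P [mW], f_S [GHz]) copied from the "This work" rows of Table 1
rows = {
    "BTA, 1.8 V": (8, 1.08, 1.0),
    "BTA, 0.8 V": (8, 0.0096, 0.1),
    "used as BTS, 1.8 V": (8, 0.51, 1.52),
}

for name, (m, p, fs) in rows.items():
    print(f"{name}: FOM = {fom(m, p, fs):.2f} GHz/mW")
```

Under this reading the three rows evaluate to 2.47, 27.78 and 7.95 GHz/mW, matching the tabulated values.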

One of the most important reported parameters is the achievable data rate. As shown in Table 1, circuits implemented in the CMOS 0.18 $\mu$m process, i.e. in a technology comparable with ours, can be as fast as 5–10 GHz (for 8 inputs), but at the expense of much larger power dissipation and chip area. Since the application of our circuit is specific, we focused instead on power dissipation and circuit complexity. The MUX has been used in the AFE, in which reading out a single channel takes ca. 100 ns, so a very large data rate is not the most important parameter. Nevertheless, the sampling frequency of ca. 1 GHz is not a significantly worse result, and the achieved FOM is much better.
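As a rough worked example, assume that the ca. 100 ns read-out time bounds the event rate that the output stage can serve. The symbols $t_{\rm RO}$ and $f_{\rm RO}$ are introduced here only for illustration:

$$f_{\rm RO} \approx \frac{1}{t_{\rm RO}} = \frac{1}{100\ {\rm ns}} = 10\ {\rm MHz}, \qquad \frac{f_{\rm S}}{f_{\rm RO}} \approx \frac{1\ {\rm GHz}}{10\ {\rm MHz}} = 100$$

Under this assumption the channel selection itself takes roughly 1% of the read-out time.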

Another important advantage of our solution is that energy consumption closely follows the activity of the input data. If in the asynchronous mode no events are present at the channel inputs (i.e. all flags are zero), or in the synchronous mode most input signals remain unchanged over a given period of time, regardless of their values, the circuit operates at low power dissipation or stays in the power down mode, since most or all ACDBs do not switch in this case. This makes the proposed circuit very useful in various portable devices or in wireless sensor network (WSN) applications, in which energy consumption is one of the main parameters.

**Table 1.** Performance comparison of selected MUX circuits.

| Ref.      | Process [µm] | $V_{\rm DD}$ [V] | $M$ | $P$ [mW] | $f_{\rm S}$ [GHz] | Type             | Area [mm<sup>2</sup>] | FOM [GHz/mW] |
|-----------|--------------|------------------|----|---------------|----------|------------------|-------------------------|--------------|
| [11]      | 0.18         | ND               | 8  | 50            | 5        | BTS <sup>a</sup> | 0.9                     | 0.27         |
| [12]      | 0.12         | 1.3              | 4  | 105           | 15       | BTS              | 0.66                    | 0.29         |
| [3]       | 0.18         | 2                | 16 | 30            | 3.6      | SR <sup>b</sup>  | > 2                     | 0.48         |
| [2]       | 0.15         | 2                | 8  | 118           | 3        | MP <sup>c</sup>  | > 2                     | 0.07         |
| [15]      | 0.18         | 1.8              | 16 | 24            | 1.65     | BTS              | 0.858                   | 0.28         |
| [16]      | 0.18         | 1.8              | 16 | 36.2          | 2        | BTS              | 0.78                    | 0.22         |
| [17]      | 0.18         | 1.8              | 8  | 30            | 5        | BTS              | 0.029                   | 0.44         |
| [14]      | 0.18         | 2.2              | 8  | 112           | 10.2     | BTS              | 0.13                    | 0.24         |
| [13]      | 0.18         | 1.5              | 2  | 40            | 40       | BTS              | 0.30                    | 2.00         |
| [18]      | 0.18         | 2                | 2  | 110           | 15       | BTS              | 0.11                    | 0.27         |
| This work | 0.18         | 1.8              | 8  | 1.08          | 1        | BTA <sup>d</sup> | 0.009                   | 2.47         |
| This work | 0.18         | 0.8              | 8  | 0.0096        | 0.1      | BTA              | 0.009                   | 27.78        |
| This work | 0.18         | 1.8              | 8  | 0.51          | 1.52     | used as BTS      | 0.009                   | 7.95         |

<sup>a</sup> Binary tree synchronous.

<sup>b</sup> Shift register.

<sup>c</sup> Multi-phase.

<sup>d</sup> BT asynchronous.

#### 5. Conclusions

A novel binary-tree multiplexer with an ultra-low-power asynchronous selection circuit has been proposed and realized in the CMOS 0.18 $\mu$m process. The proposed circuit has been designed as an important part of the readout front-end ASIC for multi-element detectors in medical imaging, where input data appear in random fashion in the input channels. The proposed MUX offers an alternative to the collision-preventing circuit proposed in [22].

A de-randomization block included in the MUX automatically identifies the channels that have become active, i.e. it detects the peaks of the impulses, holds the amplitudes of these peaks in the analog memory cells of the particular channels until they are read out by the output stage, and finally resets these channels. One of the important purposes of the proposed circuit is to eliminate, or at least strongly limit, situations in which two or more input signals attempt to connect to the output stage at the same time. In the proposed circuit this feature does not depend on the number of simultaneous events, even for a large number of parallel channels.
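The selection and serialization behaviour described above can be illustrated with a purely behavioural sketch. It does not model the ACDB-based circuit, its timing or its power. The function names are hypothetical, and the fixed priority given to the lower-index branch is an assumption made here for illustration only.

```python
def select_active(flags):
    """Walk a binary tree over the channel flags and return the address
    of one active channel, or None if no flag is set.

    Assumption (not from the paper): when both branches of a node hold
    an active channel, the lower-index branch wins.
    """
    n = len(flags)
    if n == 0 or n & (n - 1):
        raise ValueError("number of channels must be a power of two")
    if not any(flags):
        return None            # idle: nothing to select
    lo, hi = 0, n              # current subtree covers flags[lo:hi]
    while hi - lo > 1:         # one iteration per tree layer (log2 M in total)
        mid = (lo + hi) // 2
        if any(flags[lo:mid]):
            hi = mid           # an event is pending in the lower half
        else:
            lo = mid           # otherwise it must be in the upper half
    return lo


def serialize(flags):
    """Read out simultaneously active channels one at a time,
    clearing each flag after its read-out so that no event is lost."""
    pending = list(flags)
    order = []
    while (addr := select_active(pending)) is not None:
        order.append(addr)     # channel is read out by the output stage
        pending[addr] = 0      # channel is reset at the end of the read-out
    return order


print(serialize([0, 1, 0, 0, 1, 0, 0, 1]))  # three simultaneous events
```

For eight channels with simultaneous events on channels 1, 4 and 7, the sketch returns the addresses one by one in the order `[1, 4, 7]`. Every pending event is eventually served, whatever the number of coincident flags.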

The proposed circuit enables a simple standby mode when no data arrive at its inputs. The energy used to detect one active channel is below 1 pJ (for 8 inputs). The circuit is able to work with an input data frequency of about 1 GHz, which leaves a large margin in the target medical imaging application, where the data rate will be at the level of 10 MHz.

#### References

- [1] K. Iniewski (Ed.), Medical Imaging: Principles, Detectors, and Electronics, Wiley, 2009.
- [2] H. Spieler, Semiconductor Detector Systems, Oxford University Press, 2005.
- [3] G. De Geronimo, A. Kandasamy, P. O'Connor, Analog peak detector and derandomizer for high-rate spectroscopy, IEEE Transactions on Nuclear Science 49 (4, Pt 1) (2002) 1769–1773.
- [4] G. De Geronimo, P. O'Connor, J. Grosholz, A generation of CMOS readout ASIC's for CZT detectors, IEEE Transactions on Nuclear Science 47 (2000) 1857–1867.
- [5] M. Zoladz, P. Grybos, M. Kachel, P. Kmon, R. Szczygiel, Analogue multiplexer for neural application in 180 nm CMOS technology, in: Proceedings of the Sixteenth International Conference Mixed Design of Integrated Circuits & Systems (MIXDES), 2009, pp. 230–233.
- [6] G. Mazza, A. Rivetti, G. Anelli, F. Anghinolfi, M.I. Martinez, F. Rotondo, A 32-channel, 0.25 μm CMOS ASIC for the readout of the silicon drift detectors of the ALICE experiment, IEEE Transactions on Nuclear Science 51 (5, Pt 1) (2004) 1942–1947.

- [7] A. Rivetti, G. Anelli, F. Anghinolfi, G. Mazza, F. Rotondo, A low power 10-bit ADC in a 0.25 μm CMOS: design considerations and test results, IEEE Transactions on Nuclear Science 48 (2001) 1225–1228.
- [8] N. Verma, A.P. Chandrakasan, A 25 μW 100 kS/s 12b ADC for wireless microsensor applications, in: Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2006, pp. 822–831.
- [9] M. Kurisu, M. Kaneko, T. Suzaki, A. Tanabe, M. Togo, A. Furukawa, T. Tamura, K. Nakajima, K. Yoshida, 2.8Gb/s 176-mW byte-interleaved and 3.0-Gb/s 118mW bit-interleaved 8:1 multiplexers with a 0.15µm CMOS technology, IEEE Journal of Solid-State Circuits 31 (12) (1996) 2024–2029.
- [10] T. Nakura, K. Ueda, K. Kubo, Y. Matsuda, K. Mashiko, T. Yoshihara, 3.6-Gb/s 340-mW 16:1 pipe-lined multiplexer using 0.18 μm SOI-CMOS technology, IEEE Journal of Solid-State Circuits 35 (5) (2000) 751–756.
- [11] Hung-Wen Lu, Chau-Chin Su, A 5 Gbps CMOS LVDS transmitter with multiphase tree type multiplexer, in: Proceedings of the IEEE Asia-Pacific Conference on Advanced System Integrated Circuits, 2004, pp. 228–231.
- [12] D. Kehrer, H.-D. Wohlmuth, H. Knapp, A.L. Scholtz, A 15 Gb/s 4:1 parallel-to-serial data multiplexer in 0.12 μm CMOS, in: Proceedings of the 28th European Solid-State Circuits Conference, 2002, pp. 227–230.
- [13] D. Kehrer, H.-D. Wohlmuth, H. Knapp, M. Wurzer, A.L. Scholtz, 40-Gb/s 2:1 multiplexer and 1:2 demultiplexer in 120-nm standard CMOS, IEEE Journal of Solid-State Circuits 38 (11) (2003) 1830–1837.
- [14] A. Tanabe, M. Umetani, I. Fujiwara, T. Ogura, K. Kataoka, M. Okihara, H. Sakuraba, T. Endoh, F. Masuoka, 0.18-µm CMOS 10-Gb/s multiplexer/ demultiplexer ICs using current mode logic with tolerance to threshold voltage fluctuation, IEEE Journal of Solid-State Circuits 36 (6) (2001) 988–996.
- [15] K. Short, N.T. Trung, Soo-Won Kim, Jae-Tack Yoo, A reliable static-logic-based 16:1 binary-tree multiplexer in 0.18 μm CMOS, in: Proceedings of the 50th Midwest Symposium on Circuits and Systems (MWSCAS), 2007, pp. 1193–1196.
- [16] X. Tang, X.J. Wang, S.Y. Zhang, Y.S. Chi, N. Jiang, F.Y. Huang, A 2-Gb/s 16:1 multiplexer in 0.18-μm CMOS process, in: Proceedings of the International Conference on Microwave and Millimeter Wave Technology (ICMMT), vol. 2, 2008, pp. 868–870.
- [17] Hungwen Lu, Chauchin Su, Chien-Nan Jimmy Liu, A tree-topology multiplexer for multi-phase clock system, IEEE Transactions on Circuits and Systems I: Regular Papers 56 (1) (2009) 124–131.
- [18] Jun-Chau Chien, Liang-Hung Lu, A 15-Gb/s 2:1 multiplexer in 0.18 µm CMOS, IEEE Microwave and Wireless Components Letters 16 (10) (2006) 558–560.
- [19] C.K. Yang, M.A. Horowitz, A 0.8-µm CMOS 2.5 Gb/s Oversampling Receiver and Transmitter for Serial Links, IEEE Journal of Solid-State Circuits 31 (12) (1996) 2015–2023.
- [20] R. Długosz, K. Iniewski, High precision analog peak detector for X-ray imaging applications, Electronics Letters 43 (8) (2007) 440–441.
- [21] R. Długosz, R. Wojtyna, Novel CMOS analog pulse shaping filter for solid state X-ray sensors in medical imaging systems, in: E. Kącki, M. Rudnicki, J. Stempczyńska (Eds.), Computers in Medical Activities, Book series: Advances in Intelligent and Soft Computing, ISSN: 1615-3871, ISBN: 978-3-642-04461-8, vol. 65/2009, Springer-Verlag, Berlin/Heidelberg, 2009, pp. 155–165 (Chapter 16).
- [22] A. Dragone, G. De Geronimo, et al., The PDD ASIC: highly efficient energy and timing extraction for high-rate applications, in: Proceedings of the IEEE Nuclear Science Symposium, 2005, pp. 914–918.