CMS data reproduction with a GAN

Author: Jeong Hwa Kim (Korea University, Department of Physics)

Based on lectures given by Suyong Choi (Korea University, Department of Physics) at the QUC school, KIAS

Analysis scheme

From the CMS dataset, we will measure the mass and decay width of the Z boson decaying through the muon channel. This is done by applying appropriate selections to the data and extracting the muons that are likely to originate from the event of interest. We then reproduce the same analysis with GAN-generated data.

About the CMS data

The CMS data we will look at come from the CMS masterclass tutorial at http://ippog.org/resources/2012/cms-hep-tutorial. The data were collected by the CMS experiment in 2011, in proton-proton collisions at a center-of-mass energy of $\sqrt{s}=7\ \mathrm{TeV}$, using a single-muon trigger. The trigger requires at least one muon with $p_T>24\ \mathrm{GeV}$. The data correspond to an integrated luminosity of $50\ \mathrm{pb}^{-1}$. The accompanying Monte Carlo simulated samples contain a per-event weight, with which histograms should be filled in order to compare with the real data.
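As a minimal sketch of such weighted filling, using the uproot interface introduced below (the MC filename and branch choice here are assumptions; any simulated file with the same branch layout would do):

In [ ]:
import uproot
import matplotlib.pyplot as plt

# hypothetical MC file with the same branch layout as data.root
mc_tree = uproot.open('ttbar.root')['events']
nmu, w = mc_tree.arrays(['NMuon', 'EventWeight'], outputtype=tuple)
plt.hist(nmu, bins=5, weights=w)  # fill each event with its MC weight
plt.show()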

References

[1] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.

[2] de Oliveira, Luke, Michela Paganini, and Benjamin Nachman. "Controlling physical attributes in GAN-accelerated simulation of electromagnetic calorimeters." Journal of Physics: Conference Series. Vol. 1085. No. 4. IOP Publishing, 2018.

[3] Farrell, Steven, et al. "Next Generation Generative Neural Networks for HEP." EPJ Web of Conferences. Vol. 214. EDP Sciences, 2019.

Reading ROOT file and accessing information with uproot

In [108]:
# imports
import uproot
import numpy as np
from scipy.signal import savgol_filter
import matplotlib.pyplot as plt
In [109]:
dataroot = uproot.open('data.root')
In [110]:
# access the tree that is named "events"
data_tree = dataroot['events']
In [111]:
# converting to arrays all at once
alldata = data_tree.arrays()
In [112]:
# see the contents
alldata
Out[112]:
{b'NJet': array([0, 0, 0, ..., 0, 0, 0], dtype=int32),
 b'Jet_Px': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1828>,
 b'Jet_Py': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1748>,
 b'Jet_Pz': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1940>,
 b'Jet_E': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1e48>,
 b'Jet_btag': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1630>,
 b'Jet_ID': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1710>,
 b'NMuon': array([1, 1, 0, ..., 0, 1, 1], dtype=int32),
 b'Muon_Px': <JaggedArray [[4.859496] [-23.376602] [] ... [] [61.81493] [28.932827]] at 0x7f01e27d1860>,
 b'Muon_Py': <JaggedArray [[-30.239874] [-21.324467] [] ... [] [-4.7878866] [1.5360887]] at 0x7f01e27d1a58>,
 b'Muon_Pz': <JaggedArray [[137.77649] [-21.486786] [] ... [] [90.52164] [75.29765]] at 0x7f01e27d1d30>,
 b'Muon_E': <JaggedArray [[141.13979] [38.247765] [] ... [] [109.71867] [80.67971]] at 0x7f01e27d1438>,
 b'Muon_Charge': <JaggedArray [[-1] [1] [] ... [] [1] [1]] at 0x7f01e27d17f0>,
 b'Muon_Iso': <JaggedArray [[0.0] [0.5189069] [] ... [] [0.0] [0.0]] at 0x7f01e27d1978>,
 b'NElectron': array([0, 0, 0, ..., 0, 0, 0], dtype=int32),
 b'Electron_Px': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1b70>,
 b'Electron_Py': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1c18>,
 b'Electron_Pz': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1550>,
 b'Electron_E': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1908>,
 b'Electron_Charge': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1a90>,
 b'Electron_Iso': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1c88>,
 b'NPhoton': array([0, 0, 0, ..., 0, 0, 0], dtype=int32),
 b'Photon_Px': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1b00>,
 b'Photon_Py': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1668>,
 b'Photon_Pz': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1a20>,
 b'Photon_E': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1ba8>,
 b'Photon_Iso': <JaggedArray [[] [] [] ... [] [] []] at 0x7f01e27d1da0>,
 b'MET_px': array([ -8.674409,  26.828722, -14.29473 , ...,  19.300333, -33.147465,
        -15.614685], dtype=float32),
 b'MET_py': array([ 21.8988   ,  29.009142 ,   8.710307 , ...,  -4.5646696,
        -10.760271 ,  22.061094 ], dtype=float32),
 b'MChadronicBottom_px': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicBottom_py': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicBottom_pz': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MCleptonicBottom_px': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MCleptonicBottom_py': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MCleptonicBottom_pz': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicWDecayQuark_px': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicWDecayQuark_py': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicWDecayQuark_pz': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicWDecayQuarkBar_px': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicWDecayQuarkBar_py': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MChadronicWDecayQuarkBar_pz': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MClepton_px': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MClepton_py': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MClepton_pz': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MCleptonPDGid': array([0, 0, 0, ..., 0, 0, 0], dtype=int32),
 b'MCneutrino_px': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MCneutrino_py': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'MCneutrino_pz': array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
 b'NPrimaryVertices': array([ 7,  7,  2, ..., 11,  7,  9], dtype=int32),
 b'triggerIsoMu24': array([ True,  True,  True, ...,  True,  True,  True]),
 b'EventWeight': array([1., 1., 1., ..., 1., 1., 1.], dtype=float32)}
In [113]:
# muon energies in event 4
alldata[b'Muon_E'][4]
Out[113]:
array([46.11058], dtype=float32)
In [114]:
# muon information
mupx, mupy, mupz, muE, muiso, nmu, muq = data_tree.arrays(['Muon_Px', 'Muon_Py', 'Muon_Pz', 'Muon_E', 'Muon_Iso', 'NMuon', 'Muon_Charge'], outputtype=tuple)

Muon quantities

  • Transverse momentum: $p_T = \sqrt{p_x^2 + p_y^2}$
  • Azimuth: $\phi = \arctan(p_y/p_x)$
  • Rapidity: $y=\frac{1}{2}\log\left( \frac{E+p_z}{E-p_z} \right)$; for muons ($m_\mu \ll p$) this closely approximates the pseudorapidity $\eta$ used in the selection
  • Relative isolation: muon isolation / muon $p_T$
In [115]:
mupt = np.hypot(mupx, mupy)
muphi = np.arctan2(mupy, mupx)  # note: np.arctan2 takes (y, x)
mueta = 0.5 * np.log((muE+mupz)/(muE-mupz))
mureliso = muiso/mupt

Muon selection

  • High $p_T$ ($p_T>25\ \mathrm{GeV}$, with a looser $p_T>15\ \mathrm{GeV}$ working point used below)
  • $|\eta|<2.4$
  • Relative muon isolation < 0.1
In [116]:
sel_muiso = (mureliso < 0.1)
sel_mupthi = (mupt>25) & (np.abs(mueta)<2.4) & sel_muiso
sel_muptlo = (mupt>15) & (np.abs(mueta)<2.4) & sel_muiso
In [117]:
# number of muons passing each selection
print(muiso[sel_muiso].flatten().size)
print(muiso[sel_mupthi].flatten().size)
print(muiso[sel_muptlo].flatten().size)
240426
227429
238824

Analysis - Z boson decay

The Z boson can decay to two oppositely charged muons:

$Z\rightarrow \mu^++\mu^-$

From the sample, we can reconstruct Drell-Yan production in the dimuon channel. To match the trigger, we require asymmetric $p_T$ thresholds:

  • Two isolated muons with $p_T>15\ \mathrm{GeV}$ and $|\eta|<2.4$, at least one of which must have $p_T>25\ \mathrm{GeV}$
In [118]:
# asymmetric pT selection
twomusel = (mupt[sel_mupthi].counts>=1) & (mupt[sel_muptlo].counts>=2)
In [119]:
# opposite charge selection
oppcharge = (muq[twomusel, 0] * muq[twomusel, 1])<0
In [120]:
# total number of events, and number of events with two selected muons
print(twomusel.size)   # length of the event-level mask, i.e. all events
print(oppcharge.size)  # built from events passing twomusel, so this counts them
469384
15403
In [121]:
# pT of the first vs. second muon
plt.plot(mupt[twomusel, 0], mupt[twomusel, 1], '.')
Out[121]:
[<matplotlib.lines.Line2D at 0x7f01e2774780>]
In [506]:
# relative isolation of the first muon
plt.hist(mureliso[twomusel, 0], bins=50)
plt.show()
In [123]:
# Invariant mass of two four vectors
def invmass(E1, px1, py1, pz1, E2, px2, py2, pz2):
    Esum = E1 + E2
    pxsum = px1 + px2
    pysum = py1 + py2
    pzsum = pz1 + pz2
    invm = np.sqrt(Esum**2 - pxsum**2 - pysum**2 - pzsum**2) 
    return invm
In [124]:
twomumass = invmass(muE[twomusel, 0], mupx[twomusel,0], mupy[twomusel,0], mupz[twomusel,0], muE[twomusel, 1], mupx[twomusel,1], mupy[twomusel,1], mupz[twomusel,1])
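The opposite-charge mask is not applied to the mass in the cells that follow; since twomumass and oppcharge are both indexed by the events passing twomusel, it could, if desired, be applied directly:

In [ ]:
# optional: keep only opposite-charge dimuon candidates
twomumass_os = twomumass[oppcharge]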
In [125]:
# draw dimuon invariant mass
fig, ax = plt.subplots(figsize=(6,6))
ax.hist(twomumass, bins=100, range=(70, 100))
plt.show()
In [359]:
# fill the histogram; keep (counts, bin edges)
masshist = ax.hist(twomumass, bins=100, range=(70, 100))
In [360]:
# smooth the histogram with a Savitzky-Golay filter (15-bin window, cubic)
masssmooth = np.array([savgol_filter(masshist[0], 15, 3), masshist[1][:-1]])
In [361]:
# take the peak position (left bin edge) as the Z mass estimate
zmass = masssmooth[1][masssmooth[0].argmax()]
In [362]:
zmass  # PDG value: 91.19 GeV
Out[362]:
90.69999694824219
In [363]:
def lin_interp(x, y, i, half):
    # x position where y crosses `half`, interpolated linearly between bins i and i+1
    return x[i] + (x[i+1] - x[i]) * ((half - y[i]) / (y[i+1] - y[i]))

def half_max(x):
    # find the two half-maximum crossings of a histogram x = [counts, bin edges];
    # their separation is the full width at half maximum (FWHM)
    half = max(x[0])/2.0
    signs = np.sign(np.add(x[0], -half))
    zero_crossings = (signs[:-1] != signs[1:])  # sign changes between adjacent bins
    zero_crossings_i = np.where(zero_crossings)[0]
    return [lin_interp(x[1], x[0], zero_crossings_i[0], half),
            lin_interp(x[1], x[0], zero_crossings_i[1], half)]
In [364]:
zwidth = half_max(masssmooth)
In [505]:
print(zwidth)  # PDG width: 2.50 GeV
print(zwidth[1] - zwidth[0])
[88.26418830212165, 93.13064194625389]
4.8664536441322355
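The measured FWHM of about 4.87 GeV is noticeably larger than the PDG width because detector resolution smears the Breit-Wigner line shape. Treating the peak as a Voigt profile, the standard approximation of Olivero and Longbothum relates the three widths:

\begin{equation} \mathrm{FWHM}_V \approx 0.5346\,\Gamma_L + \sqrt{0.2166\,\Gamma_L^2 + \Gamma_G^2} \end{equation}

As a rough estimate, inserting $\mathrm{FWHM}_V \approx 4.87\ \mathrm{GeV}$ and $\Gamma_L = 2.50\ \mathrm{GeV}$ gives a Gaussian FWHM $\Gamma_G \approx 3.3\ \mathrm{GeV}$, i.e. a dimuon mass resolution of roughly $\sigma \approx 1.4\ \mathrm{GeV}$.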
In [504]:
# draw dimuon invariant mass
fig, ax = plt.subplots(figsize=(6,6))
ax.hist(twomumass, bins=100, range=(70, 100))
ax.plot(masssmooth[1], masssmooth[0], color='0.3')
plt.title('Dimuon invariant mass')
plt.xlabel('Mass (GeV)')
plt.ylabel('Events')
plt.savefig('figures/dimuon.png', dpi=300)

Reproduction with GAN

  1. Parameters

    • dimension of feature inputs (feature_dim)
    • dimension of latent or hidden features (latent_dim)
    • number of discriminator (critic) updates per generator update (ncritics)
    • weight for the gradient penalty (lam); this belongs to a WGAN-GP variant and is not used by the vanilla GAN implemented below (a sketch of the penalty term follows this list)
  2. Discriminator

    • Tries to distinguish real data from generated data
    • Multi-layer perceptron type with a single output
  3. Generator

    • Given a batch of random vectors (each of latent_dim size), it returns a batch of feature_dim-dimensional samples
    • Multi-layer perceptron type with feature_dim number of outputs
  4. Losses (the original GAN objective [1])

    • Discriminator loss, minimized via gradient descent: \begin{equation} -\log D(\vec{x}) - \log [1- D(G(\vec{z})) ] \end{equation}
    • Generator loss, also minimized (saturating form): \begin{equation} \log [1- D(G(\vec{z})) ] \end{equation}
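For reference, here is a minimal sketch of the WGAN-GP gradient-penalty term that lam would weight. It is not wired into the GAN class below; the discriminator, xreal, xfake, and lam names follow the notation used in this notebook.

In [ ]:
import tensorflow as tf

def gradient_penalty(discriminator, xreal, xfake, lam=10.0):
    # WGAN-GP: push the critic's gradient norm towards 1 on points
    # interpolated between the real and the generated batch
    batchsize = tf.shape(xreal)[0]
    eps = tf.random.uniform(shape=[batchsize, 1], minval=0.0, maxval=1.0)
    xinterp = eps * xreal + (1.0 - eps) * xfake
    with tf.GradientTape() as tape:
        tape.watch(xinterp)
        d_out = discriminator(xinterp, training=True)
    grads = tape.gradient(d_out, xinterp)
    gradnorm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=1) + 1e-12)
    return lam * tf.reduce_mean(tf.square(gradnorm - 1.0))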
In [168]:
# imports
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
  return f(*args, **kwds)
In [169]:
class GAN(keras.Model):
    def __init__(self, feature_dim, latent_dim, netarch=[128, 128], ncritics = 5, **kwargs):
        super(GAN, self).__init__(**kwargs)
        self.indim = feature_dim
        self.zdim = latent_dim
        self.netarch = netarch
        self.ncritics = ncritics
        self.setup()

    def build_discriminator(self):
        """Builds discriminator

        Returns:
            [keras.Model]: Discriminator Keras Model
        """
        xin = layers.Input(shape=(self.indim,))
        net = xin
        for arch in self.netarch:
            net = layers.Dense(arch, activation=tf.nn.leaky_relu)(net)
        dout = layers.Dense(1, activation=tf.nn.sigmoid)(net)

        discriminatormodel = keras.Model(xin, dout, name='discriminator')
        return discriminatormodel

    def build_generator(self):
        """Builds generator

        Returns:
            [keras.Model]: Generator Keras Model
        """
        zin = layers.Input(shape=(self.zdim,))
        net = zin
        for arch in self.netarch:
            net = layers.Dense(arch, activation=tf.nn.leaky_relu)(net)
        xout = layers.Dense(self.indim, activation=None)(net)

        generatormodel = keras.Model(zin, xout, name='generator')
        return generatormodel

    def setup(self):
        self.discriminator = self.build_discriminator()
        self.generator = self.build_generator()
        self.discriminator.summary() # print discriminator information
        self.generator.summary() # print generator information
    
    def compile(self, d_optimizer, g_optimizer):
        super(GAN, self).compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer

    def dloss_fn(self, xreal, xfake, training=False):
        # binary cross-entropy of the discriminator on real and generated batches
        D_xreal = self.discriminator(xreal, training=training)
        D_xfake = self.discriminator(xfake, training=training)
        critic_loss = -tf.reduce_mean(tf.math.log(D_xreal) + tf.math.log(1 - D_xfake))

        discriminator_loss = critic_loss
        return discriminator_loss

    def gloss_fn(self, xfake, training=False):
        # saturating generator loss: minimize log(1 - D(G(z)))
        D_xfake = self.discriminator(xfake, training=training)
        gloss = tf.reduce_mean(tf.math.log(1 - D_xfake))
        return gloss

    def train_step(self, xbatch):
        if isinstance(xbatch, tuple):
            xbatch = xbatch[0]
        batchsize = tf.shape(xbatch)[0]
        # update discriminator n times
        for icritic in range(self.ncritics):
            zbatch = tf.random.normal(shape=[batchsize, self.zdim])
            xfake = self.generator(zbatch, training=True)
            with tf.GradientTape() as dtape:
                discriminator_loss = self.dloss_fn(xbatch, xfake, training=True)

            dgrad = dtape.gradient(discriminator_loss, self.discriminator.trainable_variables)
            self.d_optimizer.apply_gradients(zip(dgrad, self.discriminator.trainable_variables))

        # update generator
        with tf.GradientTape() as gtape:
            zbatch = tf.random.normal(shape=[batchsize, self.zdim])
            xfake = self.generator(zbatch, training=True)
            generator_loss = self.gloss_fn(xfake, training=True)
        
        ggrad = gtape.gradient(generator_loss, self.generator.trainable_variables)
        self.g_optimizer.apply_gradients(zip(ggrad, self.generator.trainable_variables))

        return {'dloss': discriminator_loss, 'gloss': generator_loss}

Test with 1-D data

In [170]:
# 1-D data
data = np.random.normal(size=(10000,1)).astype(np.float32)
data = np.vstack((data, np.random.normal(loc=2.0, scale=0.5, size=(10000,1)).astype(np.float32)))

# good setting
# latentdim=2, learningrate=0.0001, epochs=100

latentdim = 1
learningrate = 0.0001
beta1 = 0.5
beta2 = 0.9


gan = GAN(1, latent_dim=latentdim, ncritics=3)
gan.compile(d_optimizer=keras.optimizers.Adam(learning_rate=learningrate, beta_1=beta1, beta_2=beta2), g_optimizer=keras.optimizers.Adam(learning_rate=learningrate, beta_1=beta1, beta_2=beta2))
Model: "discriminator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 1)]               0         
_________________________________________________________________
dense (Dense)                (None, 128)               256       
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 129       
=================================================================
Total params: 16,897
Trainable params: 16,897
Non-trainable params: 0
_________________________________________________________________
Model: "generator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 1)]               0         
_________________________________________________________________
dense_3 (Dense)              (None, 128)               256       
_________________________________________________________________
dense_4 (Dense)              (None, 128)               16512     
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 129       
=================================================================
Total params: 16,897
Trainable params: 16,897
Non-trainable params: 0
_________________________________________________________________
In [171]:
epochs = 20
hist = gan.fit(data, batch_size=128, epochs=epochs, verbose=1)
Epoch 1/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.1977 - gloss: -0.7027
Epoch 2/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3727 - gloss: -0.6426
Epoch 3/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3794 - gloss: -0.6840
Epoch 4/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3780 - gloss: -0.6835
Epoch 5/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3833 - gloss: -0.6917
Epoch 6/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3843 - gloss: -0.6925
Epoch 7/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3854 - gloss: -0.6933
Epoch 8/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3860 - gloss: -0.6928
Epoch 9/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3860 - gloss: -0.6925
Epoch 10/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3862 - gloss: -0.6937
Epoch 11/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3860 - gloss: -0.6931
Epoch 12/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3862 - gloss: -0.6930
Epoch 13/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3859 - gloss: -0.6929
Epoch 14/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3857 - gloss: -0.6933
Epoch 15/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3861 - gloss: -0.6933
Epoch 16/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3861 - gloss: -0.6933
Epoch 17/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3861 - gloss: -0.6924
Epoch 18/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3858 - gloss: -0.6927
Epoch 19/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3860 - gloss: -0.6924
Epoch 20/20
157/157 [==============================] - 0s 2ms/step - dloss: 1.3859 - gloss: -0.6935
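Note that both losses plateau at the theoretical equilibrium of the minimax game: when the discriminator can no longer tell real from generated samples, $D \approx 1/2$ everywhere, so the discriminator loss tends to $-2\log(1/2) \approx 1.386$ and the generator loss to $\log(1/2) \approx -0.693$, which is exactly what the log above shows.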
In [172]:
# plot loss
fig, ax = plt.subplots(1, 2, figsize=(10, 6))
ax[0].plot(hist.history['dloss'])
ax[1].plot(hist.history['gloss'])
plt.show()
In [173]:
# generate and draw 
zsample = np.random.normal(size=(20000,latentdim)).astype(np.float32)
xgenerated = gan.generator(zsample)

fig, ax = plt.subplots()
ax.hist((xgenerated[:, 0], data[:, 0]), bins=50, range=(-4.0, 4.0), density=True)
plt.show()
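To quantify the visual agreement, one option is a two-sample Kolmogorov-Smirnov test from scipy (a sketch using the arrays above):

In [ ]:
from scipy.stats import ks_2samp

# compare the generated 1-D sample with the training data
stat, pvalue = ks_2samp(np.asarray(xgenerated[:, 0]), data[:, 0])
print(stat, pvalue)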

With Z data

In [481]:
data = np.vstack(([muE[twomusel, 0], mupx[twomusel,0], mupy[twomusel,0], mupz[twomusel,0], muE[twomusel, 1], mupx[twomusel,1], mupy[twomusel,1], mupz[twomusel,1]])).T

# good setting
#latentdim = 16; learningrate = 0.0001; beta1 = 0.5; beta2 = 0.9; epochs = 1000; batch_size = 256
latentdim = 16; learningrate = 0.0001; beta1 = 0.5; beta2 = 0.9; epochs = 1000; batch_size = 256


gan = GAN(8, latent_dim=latentdim)
gan.compile(d_optimizer=keras.optimizers.Adam(learning_rate=learningrate, beta_1=beta1, beta_2=beta2), g_optimizer=keras.optimizers.Adam(learning_rate=learningrate, beta_1=beta1, beta_2=beta2))
Model: "discriminator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_67 (InputLayer)        [(None, 8)]               0         
_________________________________________________________________
dense_198 (Dense)            (None, 128)               1152      
_________________________________________________________________
dense_199 (Dense)            (None, 128)               16512     
_________________________________________________________________
dense_200 (Dense)            (None, 1)                 129       
=================================================================
Total params: 17,793
Trainable params: 17,793
Non-trainable params: 0
_________________________________________________________________
Model: "generator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_68 (InputLayer)        [(None, 16)]              0         
_________________________________________________________________
dense_201 (Dense)            (None, 128)               2176      
_________________________________________________________________
dense_202 (Dense)            (None, 128)               16512     
_________________________________________________________________
dense_203 (Dense)            (None, 8)                 1032      
=================================================================
Total params: 19,720
Trainable params: 19,720
Non-trainable params: 0
_________________________________________________________________
In [482]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

data_normed = scaler.fit_transform(data)
In [483]:
hist = gan.fit(data_normed, batch_size=batch_size, epochs=epochs, verbose=0)
In [484]:
# plot loss
fig, ax = plt.subplots(1, 2, figsize=(10, 6))
ax[0].plot(hist.history['dloss'])
ax[1].plot(hist.history['gloss'])
plt.show()
In [485]:
zsample = np.random.normal(size=(20000, latentdim)).astype(np.float32)
xgenerated = gan.generator(zsample)
# undo the standardization by hand
xgenerated = (xgenerated * np.sqrt(scaler.var_[0:8])) + scaler.mean_[0:8]
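Equivalently, since the features were standardized with StandardScaler, the rescaling can be done in one call (a sketch; inverse_transform expects a NumPy array):

In [ ]:
# equivalent to the manual rescaling above
xgenerated = scaler.inverse_transform(gan.generator(zsample).numpy())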
In [486]:
xgenerated
Out[486]:
<tf.Tensor: shape=(20000, 8), dtype=float32, numpy=
array([[190.80827  , -16.69476  ,  21.65696  , ...,  16.636204 ,
        -18.958014 ,   7.4305205],
       [ 66.24086  , -43.788376 ,  16.52717  , ...,  38.773384 ,
         -7.0946217,  66.48626  ],
       [ 27.775307 , -20.866241 , -15.936575 , ...,  12.606598 ,
         15.197504 , -93.53331  ],
       ...,
       [ 46.918903 ,  -3.32558  , -43.105198 , ...,   2.9248567,
         52.326283 , -38.218704 ],
       [ 84.48567  , -18.177925 ,  21.301216 , ...,  26.091808 ,
         -6.2156143, -46.277905 ],
       [ 38.623447 ,  27.680021 , -24.716034 , ..., -17.578304 ,
         22.220526 , -63.575813 ]], dtype=float32)>
In [508]:
# compare generated and real feature distributions
fig,ax = plt.subplots(2, 4, figsize=(20,12))
ax[0, 0].hist((xgenerated[:,0], muE[twomusel,0]), bins=50, density=True)
ax[0, 1].hist((xgenerated[:,1], mupx[twomusel,0]), bins=50, density=True)
ax[0, 2].hist((xgenerated[:,2], mupy[twomusel,0]), bins=50, density=True)
ax[0, 3].hist((xgenerated[:,3], mupz[twomusel,0]), bins=50, density=True)
ax[1, 0].hist((xgenerated[:,4], muE[twomusel,1]), bins=50, density=True)
ax[1, 1].hist((xgenerated[:,5], mupx[twomusel,1]), bins=50, density=True)
ax[1, 2].hist((xgenerated[:,6], mupy[twomusel,1]), bins=50, density=True)
ax[1, 3].hist((xgenerated[:,7], mupz[twomusel,1]), bins=50, density=True)
plt.show()
fig.savefig('figures/gan_data.png', dpi = 300)
In [488]:
twomumass_gan = invmass(xgenerated[:,0], xgenerated[:,1], xgenerated[:,2], xgenerated[:,3], xgenerated[:,4], xgenerated[:,5], xgenerated[:,6], xgenerated[:,7])
/home/yhs07128/.local/lib/python3.6/site-packages/ipykernel_launcher.py:7: RuntimeWarning: invalid value encountered in sqrt
  import sys
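The warning arises because some GAN-generated four-vectors have $E^2 < p^2$, i.e. negative invariant mass squared, which invmass turns into NaN. A minimal sketch of masking such entries (not applied here; the NaNs simply fall outside the histogram ranges below):

In [ ]:
# optional: drop unphysical (NaN) masses before histogramming
twomumass_gan_valid = twomumass_gan[np.isfinite(twomumass_gan)]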
In [511]:
# draw dimuon invariant mass
fig, ax = plt.subplots(figsize=(6,6))
ax.hist(twomumass_gan, bins=100, range=(30, 150))
plt.title('Dimuon invariant mass')
plt.xlabel('Mass (GeV)')
plt.ylabel('Events')
plt.show()
fig.savefig('figures/dimuon_gan.png', dpi=300)
/home/yhs07128/.local/lib/python3.6/site-packages/numpy/lib/histograms.py:839: RuntimeWarning: invalid value encountered in greater_equal
  keep = (tmp_a >= first_edge)
/home/yhs07128/.local/lib/python3.6/site-packages/numpy/lib/histograms.py:840: RuntimeWarning: invalid value encountered in less_equal
  keep &= (tmp_a <= last_edge)
In [490]:
# fill the histogram; keep (counts, bin edges)
masshist_gan = ax.hist(twomumass_gan, bins=100, range=(70, 100))
In [491]:
# smooth the histogram with a Savitzky-Golay filter (15-bin window, cubic)
masssmooth_gan = np.array([savgol_filter(masshist_gan[0], 15, 3), masshist_gan[1][:-1]])
In [492]:
# take the peak position (left bin edge) as the Z mass estimate
zmass_gan = masssmooth_gan[1][masssmooth_gan[0].argmax()]
In [493]:
zmass_gan  # PDG value: 91.19 GeV
Out[493]:
91.5999984741211
In [494]:
zwidth_gan = half_max(masssmooth_gan)
In [521]:
print(zwidth_gan)  # PDG width: 2.50 GeV
print(zwidth_gan[1] - zwidth_gan[0])
[85.8470346436661, 97.79479867072114]
11.94776402705503
In [520]:
# draw dimuon invariant mass
fig, ax = plt.subplots(figsize=(6,6))
ax.hist(twomumass_gan, bins=100, range=(70, 100))
ax.plot(masssmooth_gan[1], masssmooth_gan[0], color='0.3')
plt.title('Dimuon invariant mass (GAN)')
plt.xlabel('Mass (GeV)')
plt.ylabel('Events')
plt.show()
fig.savefig('figures/dimuon_gan_70to100.png', dpi=300)
/home/yhs07128/.local/lib/python3.6/site-packages/numpy/lib/histograms.py:839: RuntimeWarning: invalid value encountered in greater_equal
  keep = (tmp_a >= first_edge)
/home/yhs07128/.local/lib/python3.6/site-packages/numpy/lib/histograms.py:840: RuntimeWarning: invalid value encountered in less_equal
  keep &= (tmp_a <= last_edge)
In [519]:
# draw dimuon invariant mass
fig, ax = plt.subplots(figsize=(6,6))
ax.hist((twomumass_gan, twomumass), bins=100, range=(70, 100))
plt.title('Dimuon invariant mass')
plt.xlabel('Mass (GeV)')
plt.ylabel('Events')
plt.show()
fig.savefig('figures/dimuon_comparison.png', dpi=300)