Using the MNIST dataset (70,000 images of handwritten digits), we will train a simple CNN that predicts which digit is shown in a given image.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

# Set random seeds for reproducibility
np.random.seed(1337)
torch.manual_seed(1337)

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
Network parameters:
In [ ]:
batch_size = 128
nb_classes = 10
nb_epoch = 12

# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
pool_size = 2
# convolution kernel size
kernel_size = 3
# Learning rate
learning_rate = 1.0
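As a sanity check on these hyperparameters, the spatial sizes the network will see can be computed with the standard convolution output-size formula, out = (in + 2·padding − kernel) / stride + 1. This is a minimal sketch, not part of the model code:

```python
# Sanity-check the spatial dimensions implied by the hyperparameters above.
img_rows, img_cols = 28, 28
kernel_size = 3
pool_size = 2

def conv_out(size, kernel, padding=0, stride=1):
    # Standard convolution output-size formula
    return (size + 2 * padding - kernel) // stride + 1

after_conv1 = conv_out(img_rows, kernel_size, padding=1)  # 'same' padding for a 3x3 kernel
after_conv2 = conv_out(after_conv1, kernel_size)          # no padding
after_pool = after_conv2 // pool_size                     # 2x2 max pooling

print(after_conv1, after_conv2, after_pool)  # → 28 26 13
```

These are exactly the 28 → 26 → 13 sizes the model definition below relies on when sizing its first fully connected layer.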
Prepare the data into training and test sets.
In [ ]:
# Define transforms
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create data loaders
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# For visualization, keep the raw arrays
X_train, y_train = train_dataset.data.numpy(), train_dataset.targets.numpy()
X_test, y_test = test_dataset.data.numpy(), test_dataset.targets.numpy()
# PyTorch handles reshaping automatically through transforms
# Data shape info
print('X_train shape:', X_train.shape)
print(f'{len(train_dataset)} train samples')
print(f'{len(test_dataset)} test samples')
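Since the raw arrays were kept around for visualization, a quick preview of a few digits can be sketched as below. It assumes `X_train` / `y_train` from the cell above; here it falls back to random uint8 arrays of the same shape so the snippet also runs standalone:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

try:
    X_train, y_train
except NameError:
    # Stand-in data with MNIST's shape, used only when the real arrays are absent
    X_train = np.random.randint(0, 256, size=(60000, 28, 28), dtype=np.uint8)
    y_train = np.random.randint(0, 10, size=(60000,))

fig, axes = plt.subplots(1, 6, figsize=(9, 2))
for i, ax in enumerate(axes):
    ax.imshow(X_train[i], cmap="gray")
    ax.set_title(int(y_train[i]))
    ax.axis("off")
fig.savefig("mnist_samples.png")
```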
# Define the CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        # First convolutional layer: 1 input channel, 32 output channels, 3x3 kernel
        self.conv1 = nn.Conv2d(1, nb_filters, kernel_size=kernel_size, padding='same')
        # Second convolutional layer: 32 input channels, 32 output channels, 3x3 kernel
        self.conv2 = nn.Conv2d(nb_filters, nb_filters, kernel_size=kernel_size)
        # Max pooling layer
        self.pool = nn.MaxPool2d(pool_size, pool_size)
        # Dropout layers
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        # Size after the conv and pooling layers:
        # after conv1 (padding='same'): 28x28
        # after conv2 (no padding): 26x26
        # after pooling (2x2): 13x13
        self.fc1 = nn.Linear(nb_filters * 13 * 13, 128)
        self.fc2 = nn.Linear(128, nb_classes)

    def forward(self, x):
        # Conv block
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = self.dropout1(x)
        # Flatten
        x = x.view(-1, nb_filters * 13 * 13)
        # Fully connected layers
        x = F.relu(self.fc1(x))
        x = self.dropout2(x)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

# Create model instance
model = SimpleCNN().to(device)

# Define optimizer (Adadelta matches the Keras default for this example)
optimizer = optim.Adadelta(model.parameters(), lr=learning_rate)

# Define loss function: the model already returns log-probabilities,
# so use NLLLoss here (CrossEntropyLoss would apply log_softmax a second time)
criterion = nn.NLLLoss()
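With the model, optimizer, and loss in place, the training and evaluation loops can be sketched as below. To keep the sketch self-contained it exercises the functions on a tiny stand-in network and random MNIST-shaped tensors; in the notebook you would pass `model`, `train_loader`, and `test_loader` instead and loop over `nb_epoch` epochs:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def train_one_epoch(model, loader, optimizer, device):
    model.train()
    total_loss = 0.0
    for data, target in loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        loss = F.nll_loss(model(data), target)  # model outputs log-probabilities
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * data.size(0)
    return total_loss / len(loader.dataset)

def evaluate(model, loader, device):
    model.eval()
    correct = 0
    with torch.no_grad():
        for data, target in loader:
            data, target = data.to(device), target.to(device)
            pred = model(data).argmax(dim=1)
            correct += (pred == target).sum().item()
    return correct / len(loader.dataset)

# Smoke test on random MNIST-shaped data (a stand-in for train_loader/test_loader)
device = torch.device("cpu")
toy = TensorDataset(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,)))
toy_loader = DataLoader(toy, batch_size=32)
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.LogSoftmax(dim=1)).to(device)
opt = optim.Adadelta(net.parameters(), lr=1.0)
loss = train_one_epoch(net, toy_loader, opt, device)
acc = evaluate(net, toy_loader, device)
print(f"loss={loss:.3f} acc={acc:.3f}")
```

Scaling the per-batch loss by `data.size(0)` before dividing by the dataset length gives a correct average even when the final batch is smaller than `batch_size`.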
Show a summary of the model parameters.
In [ ]:
# Display model summary
print(model)
print("\nTotal parameters:", sum(p.numel() for p in model.parameters()))
print("Trainable parameters:", sum(p.numel() for p in model.parameters() if p.requires_grad))
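The reported total can be cross-checked by hand: each conv layer has out_channels × (in_channels × k × k) weights plus out_channels biases, and each linear layer has out_features × in_features weights plus out_features biases. A quick arithmetic sketch for the layer sizes defined above:

```python
# Parameter count per layer of SimpleCNN, computed by hand
conv1 = 32 * (1 * 3 * 3) + 32        # 320
conv2 = 32 * (32 * 3 * 3) + 32       # 9,248
fc1   = 128 * (32 * 13 * 13) + 128   # 692,352
fc2   = 10 * 128 + 10                # 1,290
total = conv1 + conv2 + fc1 + fc2
print(total)  # → 703210
```

Note that almost all of the parameters live in the first fully connected layer, which is typical for small CNNs that flatten a sizable feature map.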