OpenShift on OpenStack – no-brainer on-prem solution

I have to admit, it has taken me a while to produce this next article. Today, however, that changes, and I am happy to introduce on this blog OpenShift on OpenStack – a no-brainer on-prem solution for building apps. What makes this combination so special? Here is a short list of the top integration features:

– Fully Automated IPI experience

– Ability to deploy on a mix of VM and BM nodes

– With true multitenancy for VMs, BMs, Storage, Networking

– Dynamic, multi-tier storage as a service

– Best performance (no double network encapsulation, baremetal workers)

Disclaimer: I am not authorized to speak on behalf of Red Hat, nor am I sponsored by Red Hat or representing Red Hat and its views. The views and opinions expressed on this website are entirely my own.

Check out my video first before jumping into the configuration below.

I. OpenStack configuration

1. deploy.sh

My environment is currently configured with RHOSP 13.0.11 with the following features enabled:

– network isolation

– ceph integration

– ceph RGW

– manila with ganesha

– self-signed ssl

– ironic + inspector

– octavia 

– ldap

– net-ansible for BM multitenancy

(undercloud) [stack@undercloud-osp13 ~]$ cat deploy.sh 
#!/bin/bash
source ~/stackrc
cd ~/
time openstack overcloud deploy --templates --stack chrisj-osp13 \
  -r /home/stack/templates/roles_data.yaml \
  -n /home/stack/templates/network_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/manila-cephfsganesha-config.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ceph-mds.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-dns.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/ironic.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/ironic-inspector.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/octavia.yaml \
  -e /home/stack/templates/network-environment.yaml \
  -e /home/stack/templates/ceph-custom-config.yaml \
  -e /home/stack/templates/enable-tls.yaml \
  -e /home/stack/templates/enable-ldap.yaml \
  -e /home/stack/templates/ExtraConfig.yaml \
  -e /home/stack/templates/inject-trust-anchor.yaml \
  -e /home/stack/templates/inject-trust-anchor-hiera.yaml \
  -e /home/stack/templates/overcloud_images.yaml 
 

2. network_data.yaml custom networks:

- name: CustomBM
  name_lower: custombm
  vip: true
  ip_subnet: '172.31.10.0/24'
  allocation_pools: [{'start': '172.31.10.20', 'end': '172.31.10.49'}]
  vlan: 320

- name: StorageNFS
  name_lower: storage_nfs
  enable: true
  vip: true
  vlan: 315
  ip_subnet: '172.31.5.0/24'
  allocation_pools: [{'start': '172.31.5.20', 'end': '172.31.5.29'}]
 

These extra networks have been created to handle baremetal cleaning and provisioning, and to provide direct access to the Manila NFS storage.
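
For Ironic to actually use it, a provider network matching the CustomBM VLAN also has to exist in Neutron (it is referenced later as the baremetal cleaning and provisioning network). Here is a rough sketch of how that could look, assuming the default datacentre physnet and an allocation pool outside the ranges reserved above – adjust names and ranges to your environment:

openstack network create baremetal --share \
  --provider-network-type vlan \
  --provider-physical-network datacentre \
  --provider-segment 320
openstack subnet create baremetal-subnet --network baremetal \
  --subnet-range 172.31.10.0/24 \
  --allocation-pool start=172.31.10.100,end=172.31.10.149 \
  --gateway 172.31.10.1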

3. roles_data.yaml

My roles data changed mostly for the Controller role, by adding the two networks for Manila and baremetal as well as two additional services:

###############################################################################
# Role: Controller                                                            #
###############################################################################
- name: Controller
  description: |

  networks:
    - External
    - InternalApi
    - Storage
    - StorageMgmt
    - Tenant
    - CustomBM
    - StorageNFS

  ServicesDefault:
    ...
    - OS::TripleO::Services::CephNfs
    ...
    - OS::TripleO::Services::IronicInspector
    ...

4. NIC config controller.yaml

                - type: vlan
                  mtu: 1500
                  vlan_id:
                    get_param: CustomBMNetworkVlanID
                  addresses:
                  - ip_netmask:
                      get_param: CustomBMIpSubnet
                - type: vlan
                  mtu: 1500
                  vlan_id:
                    get_param: StorageNFSNetworkVlanID
                  addresses:
                  - ip_netmask:
                      get_param: StorageNFSIpSubnet

5. network-environment.yaml

In order to enable net-ansible I have added the following:

resource_registry:
  OS::TripleO::Services::NeutronCorePlugin: OS::TripleO::Services::NeutronCorePluginML2Ansible

parameter_defaults:
  NeutronMechanismDrivers: openvswitch,ansible
  NeutronTypeDrivers: local,vlan,flat,vxlan
  ML2HostConfigs:
    ex3400:
      ansible_host: '172.31.8.254'
      ansible_network_os: 'junos'
      ansible_ssh_pass: 'XXXXXXX'
      ansible_user: 'ansible'
      manage_vlans: 'false'
      mac: '84:c1:c1:48:06:1b'
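
Under the hood, this switch entry is matched against the local_link_connection data on each Ironic port, so the ansible driver knows which physical switch port to move between the provisioning and tenant VLANs. A hedged example of enrolling a baremetal NIC that points at this ex3400 (the node UUID, NIC MAC and switch port here are placeholders):

# Register the NIC and the switch port it is cabled to
openstack baremetal port create 52:54:00:aa:bb:cc \
  --node <node-uuid> \
  --local-link-connection switch_info=ex3400 \
  --local-link-connection switch_id=84:c1:c1:48:06:1b \
  --local-link-connection port_id=ge-0/0/10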

6. Extra-config.yaml (mostly for ironic and inspection)

  NovaSchedulerDefaultFilters:
    - RetryFilter
    - AggregateInstanceExtraSpecsFilter
    - AggregateMultiTenancyIsolation
    - AvailabilityZoneFilter
    - RamFilter
    - DiskFilter
    - ComputeFilter
    - ComputeCapabilitiesFilter
    - ImagePropertiesFilter
    - ServerGroupAntiAffinityFilter
    - ServerGroupAffinityFilter

  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicCleaningNetwork: baremetal
  IronicProvisioningNetwork: baremetal

  IronicInspectorIpRange: '172.31.10.50,172.31.10.69'
  IronicInspectorInterface: vlan320
  IronicInspectorEnableNodeDiscovery: true
  IronicInspectorCollectors: default,extra-hardware,numa-topology,logs

  ServiceNetMap:
    IronicApiNetwork: custombm
    IronicNetwork: custombm
    IronicInspectorNetwork: custombm
    IronicProvisioningNetwork: baremetal

  ControllerExtraConfig:
    ironic::inspector::add_ports: all
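
To put the Ironic bits above in context, here is a rough sketch of enrolling and inspecting a node in the overcloud (in my case this is automated with the Tower playbook in section III); the driver, BMC credentials and names are placeholders for my lab values:

# Enroll the node with its BMC details, then inspect it and make it available
openstack baremetal node create --name bm-worker-0 \
  --driver ipmi \
  --driver-info ipmi_address=172.31.8.201 \
  --driver-info ipmi_username=admin \
  --driver-info ipmi_password=secret \
  --resource-class baremetal
openstack baremetal node manage bm-worker-0
openstack baremetal introspection start bm-worker-0 --wait
openstack baremetal node provide bm-worker-0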
 

II. OpenShift Configuration

Since I wanted to take advantage of features that are not yet in the GA version of the product, I decided to use the latest (at the time of deployment) nightly developer-preview build, which happened to be:

1. openshift-install version
openshift-install 4.5.0-0.nightly-2020-06-20-194346
built from commit 2874fb3204089822b561f98a0a3fe7b15a84da00
release image registry.svc.ci.openshift.org/ocp/release@sha256:749ebc496202f1e749f84c65ea8e16091c546528f0a37849bbc6340d1441fbe1
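
If you need the installer binary matching a particular nightly, one way (assuming your pull secret has access to the CI registry) is to extract it straight from the release image:

oc adm release extract --command=openshift-install --to=. \
  registry.svc.ci.openshift.org/ocp/release@sha256:749ebc496202f1e749f84c65ea8e16091c546528f0a37849bbc6340d1441fbe1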

2. install-config

apiVersion: v1
baseDomain: openshift.lab
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    openstack:
      additionalNetworkIDs:
      - 26531a24-0959-4cdb-8d8e-61a711b5847c
      type: baremetal

  replicas: 2
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    openstack:
      additionalNetworkIDs:
      - 26531a24-0959-4cdb-8d8e-61a711b5847c

  replicas: 3
metadata:
  creationTimestamp: null
  name: production
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.100.0/24

  #- cidr: 10.0.0.0/16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  openstack:
    cloud: openstack
    computeFlavor: m1.large
    externalNetwork: external
    lbFloatingIP: 172.31.8.151
    octaviaSupport: "1"
    region: ""
    trunkSupport: "1"
    machinesSubnet: e7b40924-afd1-4c35-85c5-f0da36b3272c
    clusterOSImage: rhcos45-raw
    apiVIP: 192.168.100.200
    ingressVIP: 192.168.100.201

publish: External
pullSecret:…
sshKey: |
  ssh-rsa AAAAB …

In the config above I am highlighting a few features that are not there out of the box:

– additionalNetworkIDs – allows attaching more than a single network to the instances. In my case I have pointed it at the UUID of the StorageNFS network

– type: baremetal – allows specifying an OpenStack flavor (here, baremetal) to be used for each of the roles

– machineNetwork / machinesSubnet – another new feature that brings Bring-Your-Own-Network (BYON) functionality to the OCP IPI installer. In my case I have pre-created a tenant VLAN network and asked OCP to use it; the CIDR needs to match as well

– apiVIP / ingressVIP – also part of the BYON functionality; these let you manually pick the IPs used for the API and ingress endpoints

– clusterOSImage – allows specifying a pre-uploaded Glance RHCOS image (in my case I have been using different images)
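
Since most of these settings point at pre-existing OpenStack resources, here is a rough sketch of how they could be created on the tenant side before running the installer (the names, image file and flavor sizing are assumptions from my lab, not official steps):

# Machine network and subnet (BYON); the subnet UUID goes into machinesSubnet.
# In my case this is a tenant VLAN network so the baremetal workers can reach it.
openstack network create ocp-machine-net
openstack subnet create ocp-machine-subnet --network ocp-machine-net \
  --subnet-range 192.168.100.0/24

# RHCOS image pre-uploaded to Glance, referenced by clusterOSImage
openstack image create rhcos45-raw --disk-format raw \
  --container-format bare --file rhcos-openstack.x86_64.raw

# Floating IP reserved for the API, referenced by lbFloatingIP
openstack floating ip create --floating-ip-address 172.31.8.151 external

# Baremetal flavor used by the workers (type: baremetal); depending on how your
# Ironic resource classes are set up it may also need resources:CUSTOM_* properties
openstack flavor create --ram 131072 --disk 100 --vcpus 16 baremetal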

III. Ansible

The Ansible Tower integration for zero-touch provisioning is rather simple and involves a single playbook that should be added to your Tower schedule. I have shared the source code here:

https://github.com/OOsemka/AnsibleTower-IronicTools

IV. Post Deploy tasks

1. Manila CSI driver integration

Thanks to Mike Fedosin (the main developer of the Manila CSI integration), the process of adding the Manila driver to OCP is really simple. To make sure you have the latest instructions, please follow the README on his GitHub page:

https://github.com/openshift/csi-driver-manila-operator

Please note that for an OCP deployment with baremetal nodes, Manila is truly the “only game in town” from the OpenStack perspective. Cinder integration is there, but only for workers running as VMs.
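
Once the driver is in place, consuming a share is just a matter of creating a RWX claim against the Manila-backed storage class. A minimal sketch, assuming the class is named csi-manila-default (check oc get storageclass for the actual name in your cluster):

cat <<'EOF' | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: csi-manila-default
EOF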

2. Swift and self-signed certs workaround

Unfortunately there is still a bug if you end up using self-signed certs like me. The bug itself can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=1810461

The workaround is rather simple.

After authenticating as kubeadmin to your deployed OCP cluster execute the following:

oc edit configs.imageregistry.operator.openshift.io/cluster

then find the spec section and add the following to it:

disableRedirect: true
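
The same change can be applied non-interactively, which is handy for automation:

oc patch configs.imageregistry.operator.openshift.io/cluster \
  --type merge -p '{"spec":{"disableRedirect":true}}'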

3. Timeout not long enough for the initial baremetal node deployment

Many servers simply do not boot quickly enough for the OCP installer's timeouts, and you will see the deployment fail in these cases. It is a bug that has been captured here:

https://bugzilla.redhat.com/show_bug.cgi?id=1843979

The workaround is also quite simple. After the initial failure, just execute something like this (example):

openshift-install --dir=<insert-your-install-directory-here> wait-for install-complete

This will allow you to see the finish line.

Please note this only happens with the initial deploy; scaling out should not hit a similar problem.

Big thanks to Paul Halmos, Mike Fedosin, Martin Andre and entire shiftstack team!
