18c build failing at TASK [oradb-manage-db : manage-db | create/manage database] -[FATAL] ORA-03113: end-of-file on communication channel #5

Open
brokedba opened this issue Sep 13, 2019 · 12 comments

Comments

@brokedba

Hi Mikael,
I have tried your build several times now and it still fails at the same stage.
I took the liberty of changing a few attributes (IPs, hostnames, SCAN name, DB names, db_home path) in the corresponding Vagrantfile and hosts.yml.

The failure seems to kick in just before or during PDB creation, as all the prior steps are processed up to this task: [oradb-manage-db : manage-db | create/manage database]
dbca creates the DB but can't bounce the instances. (A previous run had successfully installed the DBs but stopped at the PDB creation.)
Here is the output I got on the last try:

TASK [oradb-manage-db : manage-db | create/manage database] ********************
fatal: [london1]: FAILED! => {"changed": false, "msg": "Error - STDOUT: [WARNING] [DBT-09102] Target environment does not meet some optional requirements.\n CAUSE: Some of the optional prerequisites are not met. See logs for details.\n ACTION: Find the appropriate configuration from the log file or from the installation guide to meet the prerequisites and fix this manually.\nPrepare for db operation\n8% complete\nCopying database files\n33% complete\nCreating and starting Oracle instance\n34% complete\n35% complete\n39% complete\n[FATAL] ORA-03113: end-of-file on communication channel\n\n50% complete\n100% complete\n[FATAL] ORA-03113: end-of-file on communication channel\n\n33% complete\n8% complete\n0% complete\nLook at the log file "/u01/app/oracle/cfgtoollogs/dbca/racdb/racdb.log" for further details.\n, STDERR: , COMMAND: /u01/app/oracle/product/18.3.0.0/db1/bin/dbca -createDatabase -silent -responseFile /u01/stage/rsp/dbca_racdb.rsp -initParams db_create_file_dest=+DATA,db_create_online_log_dest_1=+FRA,db_recovery_file_dest=+FRA,db_recovery_file_dest_size=20G"}

NO MORE HOSTS LEFT *************************************************************
to retry, use: --limit @/vagrant/extra-provision/ansible-oracle/vbox-rac-dc1.retry
PLAY RECAP *********************************************************************
london1 : ok=116 changed=64 unreachable=0 failed=1
london2 : ok=95 changed=53 unreachable=0 failed=0

Ansible failed to complete successfully. Any error output should be visible above. Please fix these errors and try again.

When I check the CRS status, the instances are there but not up and running.

Instance racdb1 is not running on node london1
Instance racdb2 is not running on node london2

Any idea what the problem could be here? I can show the hosts.yml and Vagrantfile if you want.
Thanks again

@brokedba
Author

I tried another time and now it stops at the manage tablespace task:

TASK [oradb-manage-pdb : Manage pdb(s)] ****************************************

TASK [oradb-manage-tablespace : Manage tablespaces (db/cdb)] *******************

TASK [oradb-manage-tablespace : Manage tablespaces (pdb)] **********************
failed: [london1] (item=port: 1521 service: PDB tablespace: users content: permanent state: present) => {"changed": false, "item": [{"cdb": "rac_db", "home": "18300-base", "init_parameters": [{"name": "db_create_file_dest", "scope": "both", "state": "present", "value": "+DATA"}], "pdb_name": "PDB", "roles": [{"grants": ["create session", "create table", "select any table", "select any dictionary"], "name": "approle1", "state": "present"}], "services": [{"ai": "racdb2", "name": "app1_service", "pi": "racdb1", "state": "started"}], "state": "present", "users": [{"default_tablespace": "appuser1_data", "grants": ["approle1"], "schema": "appuser1", "state": "present"}]}, {"autoextend": false, "bigfile": true, "content": "permanent", "maxsize": "500M", "name": "users", "next": "5M", "size": "10M", "state": "present"}], "msg": "Could not connect to database - ORA-12514: TNS:listener does not currently know of service requested in connect descriptor, connect descriptor: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=london1)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PDB)))"}
failed: [london1] (item=port: 1521 service: PDB tablespace: appuser1_data content: permanent state: present) => {"changed": false, "item": [{"cdb": "rac_db", "home": "18300-base", "init_parameters": [{"name": "db_create_file_dest", "scope": "both", "state": "present", "value": "+DATA"}], "pdb_name": "PDB", "roles": [{"grants": ["create session", "create table", "select any table", "select any dictionary"], "name": "approle1", "state": "present"}], "services": [{"ai": "racdb2", "name": "app1_service", "pi": "racdb1", "state": "started"}], "state": "present", "users": [{"default_tablespace": "appuser1_data", "grants": ["approle1"], "schema": "appuser1", "state": "present"}]}, {"autoextend": false, "bigfile": true, "content": "permanent", "maxsize": "500M", "name": "appuser1_data", "next": "5M", "size": "10M", "state": "present"}], "msg": "Could not connect to database - ORA-12514: TNS:listener does not currently know of service requested in connect descriptor, connect descriptor: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=london1)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=PDB)))"}

NO MORE HOSTS LEFT *************************************************************
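
For readers comparing connect descriptors by eye: the ORA-12514 above is raised for service PDB on london1:1521. The sketch below is a small illustrative helper (the function name `parse_descriptor` is hypothetical and not part of ansible-oracle) that pulls the host, port and service name out of a descriptor so the failing target can be compared with a working one. It only does text parsing and does not contact any listener.

```python
import re


def parse_descriptor(descriptor):
    """Return HOST, PORT and SERVICE_NAME found in an Oracle connect descriptor.

    Hypothetical helper for illustration; values are returned as strings,
    or None when a field is absent.
    """
    fields = {}
    for key in ("HOST", "PORT", "SERVICE_NAME"):
        # Tolerates both the compact "(HOST=x)" and the spaced "(HOST = x)" styles.
        pattern = r"\(\s*%s\s*=\s*([^)\s]+)\s*\)" % key
        match = re.search(pattern, descriptor, re.IGNORECASE)
        fields[key] = match.group(1) if match else None
    return fields


# Descriptor copied from the ORA-12514 message in the log above.
failed = (
    "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=london1)(PORT=1521))"
    "(CONNECT_DATA=(SERVICE_NAME=PDB)))"
)
print(parse_descriptor(failed))
```

If the service name printed here is not among the services the listener reports as registered, ORA-12514 is the expected result, which fits a PDB that was never created or opened.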

@brokedba
Author

brokedba commented Sep 13, 2019

[oracle@racdb1 admin]$ tnsping racdb succeeds on both nodes, though:

Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = london-cluster-scan.evilcorp.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = racdb)))
OK (40 msec) 

[oracle@racdb2 oracle]$srvctl status database -db racdb

Instance racdb1 is running on node london1
Instance racdb2 is running on node london2

[oracle@racdb2 oracle]$sqlplus / as sysdba
Connected to:Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
SYS @ RACDB2:CDB$ROOT:>SELECT NAME, CON_ID, DBID, CON_UID, GUID FROM V$CONTAINERS ORDER BY CON_ID;

NAME                CON_ID       DBID    CON_UID GUID
--------------- ---------- ---------- ---------- --------------------------------
CDB$ROOT                 1 1016529875          1 64A52F53A7683286E053CDA9E80AED76
PDB$SEED                 2 1898853373 1898853373 9276EB713D994B9FE053334EA8C0EAF4

I don't get why the Ansible play stopped.

@oravirt
Owner

oravirt commented Sep 16, 2019

Hi, sorry about the late answer.
Yes, please show me your hosts.yml, and then I'll see if I can reproduce this.

@brokedba
Author

Hi, no problem Mikael. Here is the hosts.yml file:

- basename_vm: london
  num_vm: 2
  hostgroup: vbox-rac-dc1
  domain: evilcorp.com
  box: oravirt/rhel75
  vagrant_user: vagrant
  vagrant_pass: vagrant
  #vagrant_private_key: /vagrant/base-provision/insecure_private_key
  ram: 6196
  cpu: 1
  base_pub_ip: 192.168.78.51
  base_pub_ip_vip: 192.168.78.61
  scan_addresses: 192.168.78.251,192.168.78.252,192.168.78.253
  base_priv_ip: 172.16.100.51
  scan_name: "london-cluster-scan"
  synced_folders:
     #- {src: swrepo, dest: /media/swrepo}
     #- {src: /Users/miksan/Downloads/oracle, dest: /media/swrepo}
     - {src: "D:\\VM\\vagrant\\software", dest: /media/swrepo}
  base_disk_path:
  create_local_disk: true
  local_disks:
     - {name: u01, size: 75, count: 1}
  create_shared_disk: true
  shared_disks:
     - {name: crs, size: 40, count: 1}
     - {name: data, size: 8, count: 1}
     - {name: fra, size: 12, count: 2}
  provisioning: extra-provision/ansible-oracle/vbox-rac-dc1.yml
  provisioning_env_override: true

And here is the Vagrantfile (I added the scan_name and oracle_home vars):
https://bit.ly/2kCXGna

I think I know where the problem comes from. Here is the content of oravirt\extra-provision\ansible-oracle\group_vars\vbox-rac-dc1\databases.yml:

 oracle_databases:
      - home: 18300-base
        oracle_db_name: rac_db
        oracle_db_type: RAC
        is_container: True
        storage_type: ASM
        oracle_db_mem_totalmb: 1024
        oracle_database_type: MULTIPURPOSE
        redolog_size: 75M
        redolog_groups: 3
        datafile_dest: '+DATA'
        recoveryfile_dest: '+FRA'
        archivelog: True
        flashback: False
        force_logging: False
        state: present
        # tablespaces:
             # - { name: users, size: 10M, bigfile: True, autoextend: false , next: 5M, maxsize: 500M, content: permanent, state: present }
        init_parameters:
             - {name: db_create_file_dest, value: '+DATA', scope: both, state: present}
             - {name: db_create_online_log_dest_1, value: '+FRA', scope: both, state: present}
             - {name: db_recovery_file_dest, value: '+FRA', scope: both, state: present}
             - {name: db_recovery_file_dest_size, value: 20G, scope: both, state: present}

The dbca database creation uses the variable oracle_db_name, in my case rac_db, but it seems to ignore the underscore during creation (the name is now racdb). So I guess the PDB creation step may look for a DB that doesn't exist (rac_db).
I am going to retry that with racdb and let you know. btw I don't understand why you split pdb and db creation in 2 tasks.
Thx

@brokedba
Author

No, it didn't fix it.

@oravirt
Owner

oravirt commented Sep 19, 2019

Thanks!
I haven't had time to look at this yet, sorry about that.

btw i dont understand why you split pdb and db creation in 2 tasks.

I may already have a CDB and only want to create a PDB and then it doesn't make sense to run the DB provisioning step.
The PDB lifecycle may be totally different to the CDB, so breaking up the management of them makes sense.

@brokedba
Author

Hi Mikaël,
I finally found the root cause of the issue. It was due to memory starvation. I don't know if you have already run this build with that low amount of RAM, but I had to increase the ram in hosts.yml from 6196 to

ram: 7168

The database creation was hanging because there was too much swapping on node 1.

               Cluster vbox-rac-dc1

      Listener   |      Port      |      london1      |      london2      |     Type     |
  ---------------------------------------------------------------------------------------
   LISTENER      | TCP:1521       |       Online      |       Online      |   Listener   |
   LISTENER_SCAN1| TCP:1521       |         -         |       Online      |     SCAN     |
   LISTENER_SCAN2| TCP:1521       |       Online      |         -         |     SCAN     |
   LISTENER_SCAN3| TCP:1521       |       Online      |         -         |     SCAN     |
  ---------------------------------------------------------------------------------------

         DB      |     Version    |      london1      |      london2      |    DB Type   |
  ---------------------------------------------------------------------------------------
   racdb         | 18.3.0.0   (1) |        Open       |        Open       |    RAC (P)   |
  ---------------------------------------------------------------------------------------
  ORACLE_HOME references listed in the Version column

         1 : /u01/app/oracle/product/18.3.0.0/db1       oracle oinstall

       : Has been restarted less than 24 hours ago
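
Swap pressure of the kind described above can be checked on a node before re-running the play. What follows is a minimal sketch, assuming a Linux guest where /proc/meminfo is readable; the function names are hypothetical and the script is not part of the build. It reports available memory and the share of swap in use, and it sets no threshold, since what counts as too much swapping during dbca depends on the setup.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for sized fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_percent(info):
    """Return the percentage of swap in use, or 0.0 when no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - info.get("SwapFree", 0)) / total


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        info = parse_meminfo(handle.read())
    print("MemAvailable: %d MB" % (info.get("MemAvailable", 0) // 1024))
    print("Swap used: %.1f%%" % swap_used_percent(info))
```

Running it on each node while dbca is creating the database shows whether swap use keeps climbing.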

@brokedba
Author

You said:

I may already have a CDB and only want to create a PDB. and then it doesn't make sense to run the DB provisioning step.

How can that be if you are creating the build from scratch? If I read your YAML correctly, the oradb-manage-db role is called first and oradb-manage-pdb much later. So at worst we could introduce a boolean var that decides whether to create the DB with a PDB or not, and call dbca once with one response file template.

extra-provision/ansible-oracle/vbox-rac-dc1.yml

- name: Database Server Installation & Database Creation
  hosts: vbox-rac-dc1
  user: vagrant
  become: yes
  roles:
    - oraswdb-install
    - {role: oraswdb-manage-patches, when: apply_patches_db}
    - oradb-manage-db

- name: Configure Logrotate
  hosts: vbox-rac-dc1
  user: vagrant
  become: yes
  roles:
    - orahost-logrotate

- name: Customize database
  hosts: vbox-rac-dc1
  user: vagrant
  sudo: yes
  roles:
    - oradb-manage-pdb
    - oradb-manage-tablespace
    - oradb-manage-parameters
    - oradb-manage-redo
    - oradb-manage-roles
    - oradb-manage-users
    - oradb-manage-grants
    - oradb-manage-services
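
To make the boolean-variable idea concrete, here is a rough sketch of how a single dbca call could be assembled with an optional PDB. This is not how ansible-oracle builds its command: the function `build_dbca_args` and the `create_pdb` switch are hypothetical, and the container flags (-createAsContainerDatabase, -numberOfPDBs, -pdbName) are given as I understand dbca's silent mode, so they should be checked against `dbca -createDatabase -help` for the installed release. Passwords and the other options carried by the response file are left out.

```python
def build_dbca_args(response_file, create_pdb, pdb_name=None):
    """Assemble one dbca -createDatabase call, optionally creating a single PDB.

    Hypothetical helper: create_pdb plays the role of the proposed boolean var.
    """
    args = ["dbca", "-createDatabase", "-silent", "-responseFile", response_file]
    if create_pdb:
        if not pdb_name:
            raise ValueError("pdb_name is required when create_pdb is true")
        # Assumed dbca silent-mode flags for creating a CDB with one PDB.
        args += [
            "-createAsContainerDatabase", "true",
            "-numberOfPDBs", "1",
            "-pdbName", pdb_name,
        ]
    return args


# Response file path taken from the failing command earlier in this thread.
print(" ".join(build_dbca_args("/u01/stage/rsp/dbca_racdb.rsp", True, "PDB")))
```

The trade-off is that a PDB created this way shares the CDB's provisioning run, whereas a separate role can add or remove PDBs on an existing CDB later.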

@oravirt
Owner

oravirt commented Sep 21, 2019

I finally found the root cause of the issue. it was due to memory starvation

Great!

I don't know if you have already run this build with that low amount of ram

Yes, I have, but it was slow.

how can that be if you are creating the build from scratch

Yes, this specific Vagrant project builds everything from scratch, but the toolkit (ansible-oracle) is its own project, which can be run completely stand-alone and is just used as the provisioning step for this project.
So you're welcome to fork the project and customize it in whatever way you want (or start your own project), but this is the way it is treated for this project.
I don't really understand why you think it is a problem to have this in 2 separate tasks?

I try to write my Ansible roles such that they do 1 thing. So I think that bundling the PDB creation with the create (C)DB step is a bad idea.

@brokedba
Author

Hi Mikael,
Sorry, I didn't mean to criticize your work. As a matter of fact, I find the build amazing. I was just curious about the RAC provisioning tasks and rules used in your build. I realize now that the ansible-oracle toolkit has a larger scope than just the RAC build. I didn't say it was a bad idea or a problem; I just wanted to know more about the context of your Ansible provisioning.
Being new to Ansible, I had to read all the extra-provision group_vars and roles involved in the playbook one by one to know what was actually happening. I even sent you emails before so I could understand it better.
I am interested in adding rhel8 to the list of boxes for this build (in a fork, for example).
By the way, how can we change the cluster name without messing up the hostgroup variable?
I also wanted to know where the oracle user password is defined.
Thank you again, and sorry if my message seemed demanding.

@yevp

yevp commented Jan 1, 2020

Hi @KoussHD,
The hash of the default password is defined here:
https://github.com/oravirt/ansible-oracle/blob/master/roles/orahost/defaults/main.yml
Relative path inside vagrant-vbox-rac: ./extra-provision/ansible-oracle/roles/orahost/defaults/main.yml

  oracle_users:         # Passwd :Oracle123
   - { username: oracle, uid: 54321, primgroup: oinstall, othergroups: "dba,asmadmin,asmdba,backupdba,dgdba,kmdba,oper", passwd: "$6$0xHoAXXF$K75HKb64Hcb/CEcr3YEj2LGERi/U2moJgsCK.ztGxLsKoaXc4UBiNZPL0hlxB5ng6GL.gyipfQOOXplzcdgvD0" }

You could generate a hash and override the default using Ansible variable precedence:

$ python -c 'import crypt; print(crypt.crypt("Oracle123", "$6$0xHoAXXF$"))'
$6$0xHoAXXF$K75HKb64Hcb/CEcr3YEj2LGERi/U2moJgsCK.ztGxLsKoaXc4UBiNZPL0hlxB5ng6GL.gyipfQOOXplzcdgvD0
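
The passwd value follows the usual crypt layout of `$id$salt$digest`, where id 6 means SHA-512. The sketch below is a small helper, written for illustration only (the name `split_crypt_hash` is hypothetical), that splits such a string so the scheme and salt can be checked before overriding the default. It does not compute or verify any hash, and it does not handle variants that carry an extra `rounds=` field.

```python
# Scheme identifiers used by common crypt(3) implementations.
CRYPT_SCHEMES = {"1": "MD5", "5": "SHA-256", "6": "SHA-512"}


def split_crypt_hash(value):
    """Split a "$id$salt$digest" string into its scheme name, salt and digest."""
    parts = value.split("$")
    if len(parts) != 4 or parts[0] != "":
        raise ValueError("expected a string of the form $id$salt$digest")
    scheme_id, salt, digest = parts[1:]
    return {
        "scheme": CRYPT_SCHEMES.get(scheme_id, "unknown"),
        "salt": salt,
        "digest": digest,
    }


# Default value quoted above from roles/orahost/defaults/main.yml.
default_passwd = (
    "$6$0xHoAXXF$K75HKb64Hcb/CEcr3YEj2LGERi/U2moJgsCK.ztGxLsKoaXc4UBiNZPL0hlxB5"
    "ng6GL.gyipfQOOXplzcdgvD0"
)
fields = split_crypt_hash(default_passwd)
print(fields["scheme"], fields["salt"])
```

A replacement hash generated with a different salt is equally valid; only the `$6$` prefix needs to match a scheme the target OS supports.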

@brokedba
Author

brokedba commented Jan 7, 2020

Thank you @yevp. It has been a while since I ran the build, and I deleted it a few days later. But I'll give it a try this weekend.
