Support Elasticsearch 7.0 and Guillotina 6 #66

Merged on Mar 12, 2020
77 commits
7f1d520
handle index not found error
vangheem Jul 8, 2019
f0124fd
Preparing release 3.3.2
vangheem Jul 8, 2019
d402ee5
Back to development: 3.3.3
vangheem Jul 8, 2019
4049888
logging
vangheem Jul 8, 2019
4b84b1c
Preparing release 3.3.3
vangheem Jul 8, 2019
1e181ad
Back to development: 3.3.4
vangheem Jul 8, 2019
540ddb1
Handle another index not found error on vacuum
vangheem Jul 9, 2019
ea1830c
Preparing release 3.3.4
vangheem Jul 9, 2019
1312559
Back to development: 3.3.5
vangheem Jul 9, 2019
a896d6e
do not close indexes on create/delete
vangheem Aug 7, 2019
11bff76
Preparing release 3.3.5
vangheem Aug 7, 2019
0e8f5cb
Back to development: 3.3.6
vangheem Aug 7, 2019
093312a
g5 support for 3.3
vangheem Aug 27, 2019
c5d5e45
bump
vangheem Aug 29, 2019
9d05c05
Preparing release 3.3.10
vangheem Aug 29, 2019
e36d759
Back to development: 3.3.11
vangheem Aug 29, 2019
08f004e
do not require request object
vangheem Sep 3, 2019
d1ca818
Preparing release 3.3.11
vangheem Sep 3, 2019
647861a
Back to development: 3.3.12
vangheem Sep 3, 2019
ae0e489
fix release
vangheem Sep 3, 2019
0145efe
Preparing release 3.3.12
vangheem Sep 3, 2019
eaa3d70
Back to development: 3.3.13
vangheem Sep 3, 2019
d67144b
Pass request on the index progress when possible
vangheem Sep 3, 2019
8dcd1c6
Preparing release 3.3.13
vangheem Sep 3, 2019
759d383
Back to development: 3.3.14
vangheem Sep 3, 2019
abe386b
Missing pg conn lock with vacuuming
vangheem Sep 6, 2019
46759e9
Preparing release 3.3.14
vangheem Sep 6, 2019
b7d0d69
Back to development: 3.3.15
vangheem Sep 6, 2019
e5e755a
fix release
vangheem Sep 6, 2019
60eeb49
Preparing release 3.3.15
vangheem Sep 6, 2019
c0602cb
Back to development: 3.3.16
vangheem Sep 6, 2019
7d7a230
[3.3.x] Changed 'query.bool.filter' to use a list instead of a si… (#60)
masipcat Sep 10, 2019
465aef0
Preparing release 3.3.16
vangheem Sep 10, 2019
24eb134
Back to development: 3.3.17
vangheem Sep 10, 2019
378554e
Fix not iterating over all content indexes in elasticsearch
vangheem Sep 18, 2019
1a41cec
Preparing release 3.3.17
vangheem Sep 18, 2019
1395200
Back to development: 3.3.18
vangheem Sep 18, 2019
9e52cc3
ISecurityInfo can be async
vangheem Sep 27, 2019
1d4292b
Preparing release 3.3.18
vangheem Sep 27, 2019
edc6081
Back to development: 3.3.19
vangheem Sep 27, 2019
8aa453a
Fix commands using missing attribute `self.request` (#61)
masipcat Sep 29, 2019
a559fc5
Pay attention to trashed objects in pg
vangheem Oct 18, 2019
b03556f
Preparing release 3.3.19
vangheem Oct 18, 2019
072dd3d
Back to development: 3.3.20
vangheem Oct 18, 2019
2df4b38
Retry conflict errors on delete by query
vangheem Oct 31, 2019
04e9a39
Preparing release 3.3.20
vangheem Oct 31, 2019
1164526
Back to development: 3.3.21
vangheem Oct 31, 2019
9413725
guillotina_elasticsearch/py.typed
vangheem Nov 1, 2019
fc0e076
Preparing release 3.3.21
vangheem Nov 1, 2019
98eafaf
Back to development: 3.3.22
vangheem Nov 1, 2019
4423620
aioelasticsearch 0.6.0 is not compatible with ES 6 (#62)
masipcat Nov 13, 2019
af194e4
Preparing release 3.3.22
vangheem Nov 13, 2019
e29701a
Back to development: 3.3.23
vangheem Nov 13, 2019
759e9f1
fix default index settings
vangheem Nov 20, 2019
7ed61af
Preparing release 3.3.23
vangheem Nov 20, 2019
0ff6f41
Back to development: 3.3.24
vangheem Nov 20, 2019
71f75fd
Make sure to save sub index changes in ES
vangheem Nov 25, 2019
709ec86
Preparing release 3.3.24
vangheem Nov 25, 2019
b9f2679
Back to development: 3.3.25
vangheem Nov 25, 2019
00591cb
porting 7.0 es guillotina parser to 5.x
jordic Jan 17, 2020
68933bc
elastic7
jordic Feb 2, 2020
1ce7bd9
WIP elastic 7
jordic Feb 2, 2020
4d02574
fix
jordic Feb 2, 2020
d04ea83
revert experiment
jordic Feb 2, 2020
790d9f8
fix
masipcat Feb 11, 2020
2452249
another-fix
masipcat Feb 11, 2020
939acdf
improvements
masipcat Feb 11, 2020
0a37698
Small fixes
masipcat Feb 11, 2020
98f1c83
Merge pull request #65 from plone/3.3.x-elastic7-masip
masipcat Feb 12, 2020
21971ba
Merge branch 'master' into 3.3.x-elastic7
masipcat Feb 27, 2020
469cce0
Fixed tests and other changes
masipcat Feb 27, 2020
49761b3
flake8 + fixes
masipcat Feb 27, 2020
e9164dc
Fix tests failing with db dummy
masipcat Mar 10, 2020
0b6e0bd
Deleted flaky mark and pytest-reruns
masipcat Mar 10, 2020
55986fd
Removed guillotina_cms fields in SEARCH_DATA_FIELDS
masipcat Mar 10, 2020
c1ffe51
Remove print
masipcat Mar 11, 2020
8af23db
Updated dependencies in setup.py. Removed config.opendistro.json
masipcat Mar 12, 2020
8 changes: 4 additions & 4 deletions .travis.yml
@@ -2,11 +2,11 @@
 dist: xenial
 language: python
 python:
-- "3.7"
+- "3.7"
 sudo: required
 env:
-- DATABASE=DUMMY
-- DATABASE=postgresql
+- ES_VERSION=7
+- ES_VERSION=7 DATABASE=postgresql

 services:
 - postgresql
@@ -23,7 +23,7 @@ cache:
 - eggs
 install:
 - pip install flake8 codecov mypy_extensions
 - pip install git+https://github.com/plone/guillotina.git@master
-- pip install -e .
+- pip install -e .[test]
 script:
 - flake8 guillotina_elasticsearch --config=setup.cfg
60 changes: 58 additions & 2 deletions CHANGELOG.rst
@@ -1,7 +1,63 @@
-5.0.1 (unreleased)
+6.0.0 (unreleased)
 ------------------

-- Nothing changed yet.
+- Support Guillotina 6
+  [masipcat]
+
+- Support elasticsearch 7.0
+  [jordic]
+
+- Make sure to save sub index changes in ES
+  [vangheem]
+
+- Fix default index settings
+  [vangheem]
+
+- Pinned aioelasticsearch to <0.6.0
+  [masipcat]
+
+- Be able to import types
+  [vangheem]
+
+- Retry conflict errors on delete by query
+
+- Pay attention to trashed objects in pg
+- Fix commands using missing attribute `self.request`
+
+- ISecurityInfo can be async
+
+- Fix not iterating over all content indexes in elasticsearch
+  [vangheem]
+
+- build_security_query(): changed 'query.bool.filter' to use a list instead of a single object
+  [masipcat]
+
+- Fix release
+
+- Missing pg conn lock with vacuuming
+  [vangheem]
+
+- Pass request on the index progress when possible
+
+- Fix release
+
+- Do not require request object for vacuuming
+  [vangheem]
+
+- G5 support
+  [vangheem]
+
+- Do not close indexes on create/delete
+  [vangheem]
+
+- Handle another index not found error on vacuum
+  [vangheem]
+
+- logging
+  [vangheem]
+
+- Handle index not found error
+  [vangheem]


 5.0.0 (2019-10-21)
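Note on the `build_security_query()` changelog entry: Elasticsearch accepts either a single clause or an array under `bool.filter`, and the list form lets callers append further clauses without rewrapping the query. A minimal sketch of that idea follows. The builder function and the field names (`access_users`, `type_name`) are assumptions for illustration, not the package's actual query.

```python
def build_filter_query(principals):
    """Hypothetical builder; not the package's build_security_query()."""
    return {
        "query": {
            "bool": {
                # A list rather than a single object, so more
                # filter clauses can be appended later.
                "filter": [
                    {"terms": {"access_users": principals}}  # assumed field name
                ]
            }
        }
    }


query = build_filter_query(["root"])
# Appending works because "filter" is a list; "type_name" is an assumed field.
query["query"]["bool"]["filter"].append({"term": {"type_name": "Item"}})
print(query)
```

With a single object under `filter`, the same caller would first have to wrap the existing clause in a list or a nested `bool`.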
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-5.0.1.dev0
+6.0.0.dev0
23 changes: 0 additions & 23 deletions config-opendistro.json

This file was deleted.

81 changes: 38 additions & 43 deletions guillotina_elasticsearch/commands/vacuum.py
@@ -5,7 +5,7 @@
 from guillotina.component import get_utility
 from guillotina.db import ROOT_ID
 from guillotina.db import TRASHED_ID
-from guillotina.db.reader import reader
+from guillotina.utils import get_object_by_uid
 from guillotina.interfaces import ICatalogUtility
 from guillotina.tests.utils import get_mocked_request
 from guillotina.tests.utils import login
@@ -19,14 +19,18 @@

 import aioelasticsearch
 import asyncio
+import elasticsearch
 import json
 import logging


 logger = logging.getLogger('guillotina_elasticsearch_vacuum')

 GET_CONTAINERS = 'select zoid from {objects_table} where parent_id = $1'
-SELECT_BY_KEYS = '''SELECT zoid from {objects_table} where zoid = ANY($1)'''
+SELECT_BY_KEYS = f'''
+SELECT zoid from {{objects_table}}
+where zoid = ANY($1) AND parent_id != '{TRASHED_ID}'
+'''
 GET_CHILDREN_BY_PARENT = """
 SELECT zoid, parent_id, tid
 FROM {objects_table}
@@ -36,10 +40,10 @@

 PAGE_SIZE = 1000

-GET_OBS_BY_TID = """
+GET_OBS_BY_TID = f"""
 SELECT zoid, parent_id, tid
-FROM {objects_table}
-WHERE of is NULL
+FROM {{objects_table}}
+WHERE of is NULL and parent_id != '{TRASHED_ID}'
 ORDER BY tid ASC, zoid ASC
 """

@@ -95,14 +99,17 @@ async def iter_batched_es_keys(self):
                 indexes.append(index['index'])

         for index_name in indexes:
-            result = await self.conn.search(
-                index=index_name,
-                scroll='15m',
-                size=PAGE_SIZE,
-                _source=False,
-                body={
-                    "sort": ["_doc"]
-                })
+            try:
+                result = await self.conn.search(
+                    index=index_name,
+                    scroll='15m',
+                    size=PAGE_SIZE,
+                    _source=False,
+                    body={
+                        "sort": ["_doc"]
+                    })
+            except elasticsearch.exceptions.NotFoundError:
+                continue
             yield [r['_id'] for r in result['hits']['hits']], index_name
             scroll_id = result['_scroll_id']
             while scroll_id:
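Note on the hunk above: the added try/except lets the generator skip an index that no longer exists instead of aborting the whole vacuum run. A self-contained sketch of that pattern, using a local stand-in for `elasticsearch.exceptions.NotFoundError` and a hypothetical fake connection so it runs without a cluster:

```python
import asyncio


class NotFoundError(Exception):
    """Local stand-in for elasticsearch.exceptions.NotFoundError."""


class FakeConn:
    """Hypothetical connection in which only 'idx_a' exists."""

    async def search(self, index, **kwargs):
        if index != "idx_a":
            raise NotFoundError(index)
        return {"hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}}


async def iter_keys(conn, indexes):
    for index_name in indexes:
        try:
            result = await conn.search(index=index_name, size=1000)
        except NotFoundError:
            # The index disappeared after it was listed: skip it.
            continue
        yield [r["_id"] for r in result["hits"]["hits"]], index_name


async def collect():
    return [item async for item in iter_keys(FakeConn(), ["gone", "idx_a"])]


batches = asyncio.run(collect())
print(batches)
```

The missing index is skipped silently here, which matches the first hunk; the later `check_missing` hunk logs a warning before continuing.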
@@ -161,33 +168,15 @@ async def get_object(self, oid):
         if oid in self.cache:
             return self.cache[oid]

-        try:
-            result = self.txn._manager._hard_cache.get(oid, None)
-        except AttributeError:
-            from guillotina.db.transaction import HARD_CACHE  # noqa
-            result = HARD_CACHE.get(oid, None)
-        if result is None:
-            result = await self.txn._cache.get(oid=oid)
-
-        if result is None:
-            result = await self.tm._storage.load(self.txn, oid)
-
-        obj = reader(result)
-        obj.__txn__ = self.txn
-        if result['parent_id']:
-            obj.__parent__ = await self.get_object(result['parent_id'])
-        return obj
+        return await get_object_by_uid(oid)

     async def process_missing(self, oid, index_type='missing', folder=False):
         # need to fill in parents in order for indexing to work...
         logger.warning(f'Index {index_type} {oid}')
         try:
             obj = await self.get_object(oid)
-        except KeyError:
+        except (AttributeError, KeyError, TypeError, ModuleNotFoundError):
             logger.warning(f'Could not find {oid}')
-            return
-        except (AttributeError, TypeError, ModuleNotFoundError):
-            logger.warning(f'Could not find {oid}', exc_info=True)
             return  # object or parent of object was removed, ignore
         try:
             if folder:
@@ -302,17 +291,23 @@ async def check_missing(self):
         async for batch in self.iter_paged_db_keys([self.container.__uuid__]):
             oids = [r['zoid'] for r in batch]
             indexes = self.get_indexes_for_oids(oids)
-            results = await self.conn.search(
-                ','.join(indexes), body={
-                    'query': {
-                        'terms': {
-                            'uuid': oids
+            try:
+                results = await self.conn.search(
+                    index=','.join(indexes),
+                    body={
+                        'query': {
+                            'terms': {
+                                'uuid': oids
+                            }
                         }
-                    }
-                },
-                _source=False,
-                stored_fields='tid,parent_uuid',
-                size=PAGE_SIZE)
+                    },
+                    _source=False,
+                    stored_fields='tid,parent_uuid',
+                    size=PAGE_SIZE)
+            except elasticsearch.exceptions.NotFoundError:
+                logger.warning(
+                    f'Error searching index: {indexes}', exc_info=True)
+                continue

             es_batch = {}
             for result in results['hits']['hits']:
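Note on the query templates changed in vacuum.py: `SELECT_BY_KEYS` and `GET_OBS_BY_TID` became f-strings, so `TRASHED_ID` is interpolated when the module is imported, while the doubled braces keep a literal `{objects_table}` placeholder for later substitution. A minimal sketch of that two-stage formatting. The `TRASHED_ID` value and the table name below are placeholders, not values taken from guillotina, and the use of `str.format()` for the second stage is an assumption based on the placeholder syntax.

```python
# Placeholder value; the real constant is imported from guillotina.db.
TRASHED_ID = "TRASHED"

# Stage 1: the f-string fills in TRASHED_ID immediately and
# collapses {{objects_table}} to a literal {objects_table}.
SELECT_BY_KEYS = f"""
SELECT zoid from {{objects_table}}
where zoid = ANY($1) AND parent_id != '{TRASHED_ID}'
"""

# Stage 2: the table name is filled in later.
# "objects" is a hypothetical table name used for illustration.
query = SELECT_BY_KEYS.format(objects_table="objects")
print(query)
```

Without the doubled braces the f-string would try to evaluate `objects_table` as a Python name at import time and fail with a `NameError`.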
2 changes: 1 addition & 1 deletion guillotina_elasticsearch/events.py
@@ -35,7 +35,7 @@ class IIndexProgress(Interface):
 @implementer(IIndexProgress)
 class IndexProgress(object):

-    def __init__(self, request, context, processed, total, completed=None):
+    def __init__(self, context, processed, total, completed=None, request=None):  # noqa
         self.request = request
         self.context = context
         self.processed = processed
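Note on the signature change above: `request` moved from the first positional argument to an optional trailing keyword, so callers that still pass `request` first would silently bind it to `context`. A reduced sketch of the new call shape. The class below omits the `@implementer(IIndexProgress)` decorator, and the `total` and `completed` assignments are assumed since the diff is truncated before them.

```python
class IndexProgress:
    # Same argument order as the new __init__ in the diff.
    def __init__(self, context, processed, total, completed=None, request=None):
        self.request = request
        self.context = context
        self.processed = processed
        self.total = total          # assumed; not visible in the diff
        self.completed = completed  # assumed; not visible in the diff


container = object()  # hypothetical stand-in for a container object

# No request available (for example, from a command): just omit it.
event = IndexProgress(container, 50, 200)

# Request available: pass it by keyword.
with_request = IndexProgress(container, 50, 200, request="fake-request")
print(event.request, with_request.request)
```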
7 changes: 7 additions & 0 deletions guillotina_elasticsearch/exceptions.py
@@ -3,3 +3,10 @@

 class QueryErrorException(HTTPException):
     status_code = 488
+
+
+class ElasticsearchConflictException(Exception):
+    def __init__(self, conflicts, resp):
+        self.conflicts = conflicts
+        self.response = resp
+        super().__init__(f"{self.conflicts} on ES request")
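
Note on the new exception: the changelog mentions retrying conflict errors on delete by query, and this class carries the conflict count and the raw response. The code that raises and retries is not part of the visible diff, so the sketch below is only one way it could be wired up. `retry_on_conflict` and `fake_delete_by_query` are hypothetical names; the `version_conflicts` key mirrors the field Elasticsearch reports in delete-by-query responses.

```python
import asyncio


class ElasticsearchConflictException(Exception):
    def __init__(self, conflicts, resp):
        self.conflicts = conflicts
        self.response = resp
        super().__init__(f"{self.conflicts} on ES request")


async def retry_on_conflict(operation, retries=3, delay=0.0):
    """Hypothetical helper: re-run `operation` while it reports conflicts."""
    for attempt in range(1, retries + 1):
        try:
            return await operation()
        except ElasticsearchConflictException:
            if attempt == retries:
                raise  # give up after the last attempt
            await asyncio.sleep(delay)


calls = []


async def fake_delete_by_query():
    # Stand-in for a real request: conflicts once, then succeeds.
    calls.append(1)
    if len(calls) == 1:
        resp = {"version_conflicts": 2, "deleted": 0}
        raise ElasticsearchConflictException(resp["version_conflicts"], resp)
    return {"version_conflicts": 0, "deleted": 5}


result = asyncio.run(retry_on_conflict(fake_delete_by_query))
print(result, len(calls))
```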