Ticket: https://hysds-core.atlassian.net/browse/HC-259
Background information required to understand the thought process and planning behind integrating A&A into HySDS
...
Example (with Keycloak):
```shell
curl -s -X POST \
  -d client_id=<client_id> \
  -d client_secret=<client_secret> \
  -d grant_type=refresh_token \
  -d refresh_token=<refresh_token> \
  "http://localhost:8080/auth/realms/<realm>/protocol/openid-connect/token" | python -m json.tool
```
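The same refresh request can be sketched in Python using only the standard library. This is a minimal illustration, not part of HySDS or Keycloak: the helper names `build_refresh_request` and `refresh_tokens` and their parameters are assumptions, and the endpoint path simply mirrors the curl example.

```python
import json
import urllib.parse
import urllib.request


def build_refresh_request(base_url: str, realm: str, client_id: str,
                          client_secret: str, refresh_token: str) -> urllib.request.Request:
    """Build (but do not send) the form-encoded refresh_token request."""
    url = f"{base_url}/auth/realms/{realm}/protocol/openid-connect/token"
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")
    return urllib.request.Request(
        url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def refresh_tokens(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON token response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Building the request separately from sending it keeps the form encoding easy to inspect without a running Keycloak instance.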
...
SSO Providers:
Keycloak
Originally the plan was to use Keycloak for A&A:
...
Requires a SQL database (MySQL, PostgreSQL, etc.)
Keycloak guide from Red Hat on how to set up realms, client apps and client roles
uses Java's Spring Boot framework for the REST API integration, but the steps can still be followed
OCIO advised against using Keycloak, instead suggesting AWS Cognito
[meeting] with OCIO, where 4 other projects are also working on Jupyter notebook front-ends to PCMs. The topic was raised of FN and public access, so that those users can sign into ADE+PCM for on-demand use. As a heads-up, OCIO recommends not using Keycloak and instead using AWS Cognito with some additional ELB proxies
AWS Cognito
According to this StackOverflow post:
Cognito exposes an OpenID Connect Discovery endpoint as described at https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderConfigurationRequest at the following location:
https://cognito-idp.{region}.amazonaws.com/{userPoolId}/.well-known/openid-configuration
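As a rough sketch of how the discovery endpoint quoted above could be used, the snippet below builds the URL from a region and user pool ID and downloads the document. The function names are illustrative assumptions, and the fetch requires network access to a real user pool.

```python
import json
import urllib.request


def cognito_discovery_url(region: str, user_pool_id: str) -> str:
    """Return the OpenID Connect discovery URL for a Cognito user pool."""
    return (f"https://cognito-idp.{region}.amazonaws.com/"
            f"{user_pool_id}/.well-known/openid-configuration")


def fetch_discovery_document(region: str, user_pool_id: str) -> dict:
    """Download and parse the discovery document (needs network access)."""
    with urllib.request.urlopen(cognito_discovery_url(region, user_pool_id)) as resp:
        return json.load(resp)
```

Per the OpenID Connect Discovery specification, the returned document includes fields such as `issuer` and `jwks_uri`, which a client or proxy can read instead of hard-coding endpoints.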
...
Role-based access control using Amazon Cognito and an external identity provider
https://docs.aws.amazon.com/cognito/latest/developerguide/what-is-amazon-cognito.html
https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-user-identity-pools.html
https://docs.aws.amazon.com/cognito/latest/developerguide/getting-started-with-identity-pools.html
JWT Tokens
Because AWS Cognito supports OpenID Connect, it supplies users with an id_token, a refresh_token and an access_token.
Example of an access_token payload:
```json
{
  "sub": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "device_key": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "cognito:groups": [
    "admin"
  ],
  "token_use": "access",
  "scope": "aws.cognito.signin.user.admin",
  "auth_time": 1562190524,
  "iss": "https://cognito-idp.us-west-2.amazonaws.com/us-west-2_example",
  "exp": 1562194124,
  "iat": 1562190524,
  "jti": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "client_id": "57cbishk4j24pabc1234567890",
  "username": "janedoe@example.com"
}
```
Subject (sub)
The sub claim is a unique identifier (UUID) for the authenticated user. It is not the same as the user name, which may not be unique.
Amazon Cognito groups (cognito:groups)
The cognito:groups claim is a list of groups the user belongs to (can be treated the same as roles).
Authentication time (auth_time)
The auth_time claim contains the time when the authentication occurred. Its value is a JSON number that represents the number of seconds from 1970-01-01T0:0:0Z as measured in UTC format. On refreshes, it represents the time when the original authentication occurred, not the time when the token was issued.
Issuer (iss)
The iss claim has the following format: https://cognito-idp.{region}.amazonaws.com/{userPoolId}
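To make the claims above concrete, here is a minimal sketch that decodes an access_token payload and checks `token_use`, `iss`, `exp` and `cognito:groups`. It deliberately does NOT verify the token signature, which a real service must do against the user pool's published keys before trusting any claim. The function names and the idea of a single `required_group` are assumptions for illustration only.

```python
import base64
import json
import time


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload_b64 = segments[1]
    # JWTs use unpadded base64url; restore the padding before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def check_access_claims(claims: dict, region: str, user_pool_id: str,
                        required_group: str, now=None) -> bool:
    """Return True if the claims look like a current access token for the pool."""
    now = time.time() if now is None else now
    expected_iss = f"https://cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    return (
        claims.get("token_use") == "access"
        and claims.get("iss") == expected_iss
        and claims.get("exp", 0) > now
        and required_group in claims.get("cognito:groups", [])
    )
```

Treating `cognito:groups` as roles, as suggested above, reduces an authorization check to a membership test on that list.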
In cases (otello, mozart + grq2 REST APIs) where a user needs to obtain a set of tokens directly (with username + password), we can leverage boto3 to obtain them (as demonstrated in this StackOverflow post):
```python
import boto3


def authenticate_and_get_token(username: str, password: str,
                               user_pool_id: str, app_client_id: str) -> None:
    """Log in with a username and password and print the resulting tokens."""
    client = boto3.client('cognito-idp')

    # Admin API call: needs AWS credentials allowed to call AdminInitiateAuth.
    # ADMIN_NO_SRP_AUTH is the legacy name of the ADMIN_USER_PASSWORD_AUTH flow.
    resp = client.admin_initiate_auth(
        UserPoolId=user_pool_id,
        ClientId=app_client_id,
        AuthFlow='ADMIN_NO_SRP_AUTH',
        AuthParameters={
            "USERNAME": username,
            "PASSWORD": password
        }
    )

    print("Log in success")
    print("Access token:", resp['AuthenticationResult']['AccessToken'])
    print("ID token:", resp['AuthenticationResult']['IdToken'])
```
ElasticSearch
Authenticating ElasticSearch directly would require a major update in the HySDS core (hysds_commons, hysds) to fetch an access_token for every background process and celery worker
An alternative is to authenticate at the proxy (apache or nginx) level:
This is a work in progress as a lot of research still needs to be done
only authenticate ElasticSearch requests coming from outside the server (hysds_ui, etc.)
internal processes can hit ES directly without having to fetch an access_token beforehand
NGINX OpenID Connect Implementation
uses OpenResty, so it'll require additional setup
current research documented in repo: https://github.com/DustinKLo/nginx-openid-demo
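The rule "authenticate only requests coming from outside the server" can be illustrated with a small sketch. It is written in Python purely to show the decision, not as nginx or Apache configuration, and it is not taken from the demo repo. `INTERNAL_NETWORKS` and `requires_token` are hypothetical names, and treating only loopback addresses as internal is an assumption.

```python
import ipaddress

# Hypothetical list of networks considered "inside the server" (loopback only).
INTERNAL_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::1/128"),
]


def requires_token(remote_addr: str, internal_networks=INTERNAL_NETWORKS) -> bool:
    """Return True when a request from remote_addr must present an access_token."""
    addr = ipaddress.ip_address(remote_addr)
    # Internal callers (background processes, workers) bypass authentication.
    return not any(addr in net for net in internal_networks)
```

In a real deployment the proxy would make this decision itself, and the address it sees depends on how the ELB and proxy layers forward client IPs.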